Data Collection - Top 6 Tips from a Data Management Perspective!

27th February 2022 in Blogs by Andy Madeley

It’s been close to 6 weeks since the last article, mainly because I have been heavily involved in data collection activities across various projects. The last article focused on “Questionnaire Design” from a “Data Management” point of view, so it seems apt to move on to the next activity area, “Data Collection”. Hope you enjoy the read!

As I have repeatedly mentioned to anyone who knows me, a survey project only has a chance to succeed if all ducks are in a row on the pre-field activities. Last time around I talked about teamwork through excellent questionnaire design engagement. Remember that the questionnaire will be passed on to a scriptwriter, so give the scriptwriter the best possible chance to bring it to life as they input it into the chosen data collection application.

Before I move on to tip 1, which is probably the most important but also the most forgotten: there are no excuses for not building a good quality questionnaire that is brimming with information for the interviewers, for the scriptwriters, for everyone! Give your project the best chance by delivering a “quality” questionnaire.

TIP 1 – Form a solid relationship with the scriptwriting / scripting team working on your project

I still think there are times when researchers forget that someone else, the scriptwriter, needs to interpret their survey requirements. Transforming that paper-based document into a living, breathing collection “script” is important work. I have worked on surveys that cost hundreds of thousands to deploy and have also worked on small postcard-sized projects with a 1-hour scriptwriting requirement.

I have rarely seen interactive conversations between research staff and scriptwriters though, from my early career days working at research agencies in the UK, to much further afield on some of the most intriguing and challenging survey landscapes. From Nepal to Northampton or from Vanuatu to Chicago, there is no more delicate challenge than making sure the scriptwriter fully understands the questionnaire.

I will say, and I have experienced this mostly on international tracker projects, that the scriptwriting teams sitting in the offices at the local field agencies will rarely feed back any concerns or issues unless politely pushed into communicating.

I call it the “black hole” period: the period when a researcher hands over their well-crafted questionnaire to the scriptwriter, who must make sure the interviewers or recipients fully understand what is being asked and how it is being asked.

In today’s survey world, surveys are getting more complicated. I am a scriptwriter by vocation, and have crossed over into the research “pre-field” environment more. There is nothing more satisfying to me than to talk to my scriptwriter, talk through the questionnaire, make them aware I expect questions or queries and I talk…  I promise it works!

TIP 2 – Script check in 3 distinct phases

Tip 1 was the most important, hence I wrote a lot on that tip point. I will keep this, and the next 4 tips much more succinct. Please reach out to me if you want to know more though.

The one area where a scriptwriter and a research person will meet and share notes is when they check the script. Irrespective of the type of device (laptop / PC via an emulator or a survey link, a tablet or a smartphone), there are 3 types of checks. There is the “answer yes” to everything phase, then there is the “answer no” to everything phase and then there is the “go random” phase.

Check 1 – starting at the first survey question, answer yes (or the most positive answer) and do that throughout the script. You are not testing / checking the routing per se, you are making sure your questions are presented exactly as intended and you are also making sure every question appears!

Check 2 – starting at the first survey question again, answer no (or the most negative answer) for nearly everything. This is called a boundary test. You are making sure that the “no” route makes sense and here you are checking the routing for sure.

Check 3 – starting at the first survey question, run 3 or 4 random checks. Click any answer at each question. Does this make sense both routing wise (logically) and research wise (contextually, re: flow)?
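To make the three phases concrete, here is a minimal Python sketch. It assumes a made-up four-question script (`SCRIPT`, with hypothetical question names and routing) rather than any real data collection application, and simply walks the routing three ways: most positive answer, most negative answer, then a few random runs.

```python
import random

# Hypothetical mini-script: options are listed most positive first and
# most negative last; "next" gives the routing per answer (None = end).
SCRIPT = {
    "Q1": {"options": ["Yes", "No"], "next": {"Yes": "Q2", "No": "Q3"}},
    "Q2": {"options": ["Very satisfied", "Neutral", "Very dissatisfied"],
           "next": {"Very satisfied": "Q3", "Neutral": "Q3",
                    "Very dissatisfied": "Q3"}},
    "Q3": {"options": ["Yes", "No"], "next": {"Yes": "Q4", "No": None}},
    "Q4": {"options": ["Yes", "No"], "next": {"Yes": None, "No": None}},
}

def run_pass(script, choose, start="Q1"):
    """Walk the script once, picking an answer with `choose` at each question."""
    path, current = [], start
    while current is not None:
        question = script[current]
        answer = choose(question["options"])
        path.append((current, answer))
        current = question["next"][answer]
    return path

def most_positive(options):
    return options[0]

def most_negative(options):
    return options[-1]

# Check 1: answer "yes" throughout and note any question that never appeared.
yes_path = run_pass(SCRIPT, most_positive)
unseen_after_yes = set(SCRIPT) - {q for q, _ in yes_path}

# Check 2: answer "no" throughout to exercise the negative routing.
no_path = run_pass(SCRIPT, most_negative)

# Check 3: a handful of random runs (seeded so they can be repeated).
rng = random.Random(2022)
random_paths = [run_pass(SCRIPT, rng.choice) for _ in range(4)]

print("yes route:", [q for q, _ in yes_path])  # Q1, Q2, Q3, Q4
print("no route: ", [q for q, _ in no_path])   # Q1, Q3
for path in random_paths:
    print("random:   ", path)
```

In this toy script the “yes” route happens to reach every question. On a real script it may not, which is why the set of unseen questions is worth looking at after Check 1.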

If you want to know more, then let’s talk!

TIP 3 – Whenever possible generate dummy data

The script checks in “tip 2” will only check so many routing permutations. If you want to be thorough, always ask for dummy data to be generated. Most recognised data collection applications have this module either built-in or as a requested add-on.

As per the example above, if you have correctly built dummy data, then the frequency percent values for every category should be more or less the same. This is a really good check to carry out and screen for.

There is also a clear and precise way of using dummy data to check filter logic. The key is to first see how many people have answered a question, then determine how many people “should have” answered it. If you are running dummy data QC checks in SPSS, first run a frequency of the question with all data selected. Then “select data”, apply the “filter” employed on the question and run a frequency on any question that is asked of all respondents (i.e. has no filters). The first count tells you how many people are answering and the second tells you how many people should be answering. If the two match, the filter is working. Voilà!
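For anyone without SPSS to hand, the same two checks can be sketched in plain Python. Everything here is an assumption for illustration: the two questions (`Q1` asked of all, `Q2` asked only when `Q1` is “Yes”), the answer codes and the 2,000 dummy records are invented, and the random generator stands in for whatever dummy data module your data collection application provides.

```python
import random
from collections import Counter

rng = random.Random(42)  # seeded so the dummy data is repeatable

def make_dummy_record():
    """Hypothetical script: Q1 is asked of all, Q2 only if Q1 == 'Yes'."""
    q1 = rng.choice(["Yes", "No"])
    q2 = rng.choice(["A", "B", "C", "D"]) if q1 == "Yes" else None
    return {"Q1": q1, "Q2": q2}

records = [make_dummy_record() for _ in range(2000)]

def frequencies(rows, var):
    """Return (base, percent per category), ignoring unanswered cases."""
    counts = Counter(r[var] for r in rows if r[var] is not None)
    base = sum(counts.values())
    return base, {k: round(100 * v / base, 1) for k, v in sorted(counts.items())}

# Check A: with random dummy data each category should get a similar share.
actual_base, q2_pct = frequencies(records, "Q2")
print("Q2 percentages:", q2_pct)

# Check B: how many answered Q2 versus how many should have answered it.
# Apply the Q2 filter, then count a question that is asked of everyone.
eligible = [r for r in records if r["Q1"] == "Yes"]
expected_base, _ = frequencies(eligible, "Q1")

print("answered Q2:", actual_base, "| should have answered:", expected_base)
if actual_base != expected_base:
    print("Filter problem: the bases do not match")
```

If the two bases differ, either some respondents skipped a question they should have seen or some saw a question they should not have, and the filter in the script needs another look.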

TIP 4 – Use “Log” files to share issues found

When I was either scripting a project on behalf of a client or training scripting teams, my pet hate was when we were sent e-mail after e-mail with information on potential glitches in the survey. The log files I use are simply MS Excel document files. There is nothing clever or sophisticated about this approach – just using plain, clear and sometimes “in detail” responses. The more information you share, the better! 

I know a lot of you are probably thinking “Google Sheets”. I am not a big fan only because this can become as unstructured as e-mail delivery. Tracking isn’t the best in this format. It’s always good to have a log file gatekeeper. I’d be glad to share my best practice approach on using log files, if you want to know more. It has never failed me or my team and we always find something to report on!

TIP 5 – CAPI, CATI and CAWI hardware is important – so accommodate it skilfully!

Is the “screen” or the “delivery” device suitable for your question or question set? What if you have a series of 10 statements, each with a 5-point Likert scale? When I design questionnaires, my first question is “how are we collecting, on what platform”, my second question is “is this self-completion or interviewer-led” and the third question is “do we need to show the respondent any words or graphics, or share any video or sound bites”.

The hardware being used should always drive the questionnaire structure. Personally, for CAPI projects, I like each of the 10 statements to be shown individually, page by page, if the scale has 5 points or more. If it is a 2-, 3- or 4-point scale, then using a grid works. I also think 10-inch screens should be the default across CAPI or CAWI. Etc… lots to think about and maybe a theme for a future article!
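That CAPI preference is simple enough to write down as a rule. The sketch below is only my personal rule of thumb from the paragraph above expressed in Python; the function name `capi_layout` is made up for illustration and it says nothing about CATI or CAWI.

```python
def capi_layout(scale_points: int) -> str:
    """Suggest a CAPI layout for a statement battery (a personal rule of thumb)."""
    if scale_points < 2:
        raise ValueError("a scale needs at least 2 points")
    # 5 points or more: one statement per page; 2 to 4 points: a grid works.
    return "one statement per page" if scale_points >= 5 else "grid"

for points in (2, 4, 5, 7):
    print(points, "point scale ->", capi_layout(points))
```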

TIP 6 – Test, test and test more…

I have participated in so many projects where my team and I were only involved once the data had been collected. At that stage, if an issue is found (big or small) then you can only firefight and try to reduce the impact of the fire. Sometimes this results in “data cleaning” and in other, more extreme cases, questions are removed, questions are re-fielded or the survey is re-fielded or thrown out.

Our statistics show that the more testing carried out at the pre-field stage, the greater the chance that the survey will require no (or only very minor) data cleaning. I despise “edits”. They are the bane of a data processing analyst’s day / time on a project!

Testing starts with the questionnaire – dry run it on paper! Then script check, pattern check translations if required, QC check dummy data, QC check pilot data and then right before fieldwork starts, carry out one last script check.

We’ve been quite successful over the last 2-3 years with our translation pattern check QC approach. We don’t need to know the language and we don’t need to run AI algorithms. My philosophy is that it’s always good (and does no harm) to check as much as you can at the pre-field stage. I am no Amharic language expert, but I can still deliver invaluable comments that help with awareness and peace of mind.


That’s it for this article. You must have questions, no? We’re not magicians or mind-readers in the data management teams and some of us are quite introverted too. Break down barriers, build relationships and do everything you can to test and check before fieldwork starts. We have 20 years of experience that proves that more effort on pre-field equals quicker turnaround times once fieldwork ends and less firefighting. I call that heavenly data management!

Drop me an e-mail at any time if you want to discuss any of my tips and thoughts (based on my own experiences) in more detail!

