Even if you don’t wholly agree with the statement in the title of this article, I’d like to think that by the time you have finished reading I’ll have been able to twist your arm a little more. Also, please note that as much as my content here relates to my work within the quantitative project world, it is still relatable in parts to the qualitative project world. So, let’s start by sharing my 4 rules surrounding “data quality”: 1 = never assume, 2 = ask questions, 3 = you will always find something wrong, so find it and 4 = build a friendly relationship with the collection scripting team!
Before exploring my 4 rules more, please allow me to deviate briefly to confirm that the whole topic of data quality and data management from an academic point of view can make for an exceptionally dry read – even for me! Hopefully though, I can give you some information here that you can take away and reflect on in relation to your own project management operations.
To kickstart, some information about me that hopefully validates my credentials for talking around this subject matter.
I have worked on some fascinating projects over the years. My personal “passion” area over the last few years has been media or social research focussed projects in developing or post-conflict countries. I like to travel and learn about culture, and this niche area has allowed me to tick that box, BUT it has also allowed me to learn more about data quality around the world.
Many years ago, my career started in the PAPI (paper-pen) and CATI collection world, then as MR evolved through the “noughties” (2000s) I worked more heavily on CAWI collection platforms. Through the “10s” or “teensies” (2010s) I moved into the fascinating world of CAPI. Now in the “20s” (2020s and no official nickname as yet!) I’ve been focussing more heavily on the analysis and dissemination stages.
I apply the same data quality and data management approach to all collection platforms and to all project stages. I am now recognised (I believe!) as a data quality expert amongst my peers. I really enjoy dissecting a project, learning about it from both logistical and content points of view and adding my know-how to help better design, collect, collate, analyse and disseminate (share) the “data” associated with that project. For those of you who, like me, are “Bake Off: The Professionals” fans, you will see why I say “Love it, love it, love it!”. I really do have a passion for working with data, then sharing that passion by making sure a project is nurtured into something shareable.
I’m sure you’ve all heard the phrase “garbage in, garbage out” at some time! Sadly, I’ve seen too many GIGO projects. In the earlier days of selling the company’s data quality and data management package, we were perceived as a luxury. “Yes, that would be great if you could quality control (QC) the data once the fieldwork period is complete, then run some crosstabs for us and send to the project research manager.” This was a typical reply through the “noughties”. Unfortunately, this attitude (of its time) only meant one thing: we became the firefighters, trying to put out the fire, and then the paramedics too, trying to repair (and clean) the data into some presentable form without artificially skewing it. If your data quality and data management does not start until fieldwork has finished, you can do nothing more than stick a plaster on the data. So, after the longish lead-in, back to my 4 points. I hope the scene is now set!
Points 1, 2 and 3 – “Never Assume”, “Ask questions” and “You will always find something wrong!”
Any MR (quant survey) project goes through the following 4 stages: design, collection, collation and analysis. When I have given training to field teams in the past, I have always started by saying “Never assume anything, and you have a voice – use it!”.
SwissPeaks (and now Tabx) have been working with 2 large “end client” companies for 10+ years now and they have allowed me to personally jump on a plane to oversee and give training, as needed, to the local field agencies who are carrying out the CAPI (or PAPI) data collection for them. With any given project of this type, you will have an end-client, a research agency and a field agency. Within the research agency or, more typically, the field agency, you will then have scripters, interviewers (enumerators), supervisors, translators, coders and research managers. For me, the central (fulcrum) person to any project and its success is the “scripter”.
The scripter has to take a questionnaire supplied by the research agency, copy/paste it and work their magic to remaster it into an interview (or data entry) script. A good scripter should never assume anything! If the questionnaire doesn’t make sense, they should ask questions. Sadly, this is seldom the case and the scripter will tend to “assume” that they know what the research team (the questionnaire designers) want. And this is where things start to go wrong.
My rule is that if a field agency asks no (or very few!) questions, then be worried! If a research agency asks no questions of the end-client, then this is another worry scenario. It’s normally at the field agency level where such problems occur, as the research agency and/or the research manager at the field agency are not talking to the scripter and vice versa. Everyone starts assuming. Filter logic, masking, dynamic text substitution and question block scenarios normally start to unravel at this point.
We have a clear mandate re: QC within our company. We report our issues using MS Excel sheets – we call them review files. We do not apply tracked changes to MS Word questionnaire files and we rarely transmit QC information in e-mail threads.
We will always find issues with a questionnaire, with a codebook, with a collection script or even within a translation. For translation QC, we simply pattern match. A good place to start with translation QC is with the wording “Don’t Know”: check this across the translation and you might find different translations for it. We have, in the past! This could suggest there are 2 or more translators working on a script, and if they cannot agree on a straightforward phrase like “Don’t Know” then maybe there are more worrying inconsistencies. As I said: “Never Assume”, “Ask Questions” and “You will always find something wrong!”. That last one, “You will always find something wrong!”, is a somewhat arbitrary statement, so maybe it is better to say “You will always find something that has made you pause, think and query”. If so, write it down, then ask about it. “Never Assume”.
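For anyone who wants to automate that first “Don’t Know” pattern match, here is a minimal Python sketch. It is only an illustration and not the tooling we use: the function name, the assumption that the questionnaire has already been exported as (source text, translated text) pairs, and the French sample rows are all made up for the example.

```python
from collections import defaultdict


def find_inconsistent_translations(rows):
    """Return {source phrase: sorted translations} for phrases translated more than one way.

    `rows` is an iterable of (source_text, translated_text) pairs. This layout is
    an assumption for the sketch, not a real export format.
    """
    seen = defaultdict(set)
    for source, translation in rows:
        # Normalise the source phrase so "Don’t Know" and "Don't know" are treated as the same text.
        key = " ".join(source.replace("’", "'").lower().split())
        # Collapse stray whitespace in the translation but otherwise keep it as written.
        seen[key].add(" ".join(translation.split()))
    # Keep only the phrases that ended up with more than one distinct translation.
    return {phrase: sorted(found) for phrase, found in seen.items() if len(found) > 1}


# Hypothetical sample rows, invented purely for illustration.
rows = [
    ("Don't know", "Ne sait pas"),
    ("Don’t Know", "Je ne sais pas"),
    ("Refused", "Refus"),
    ("Refused", "Refus"),
]

issues = find_inconsistent_translations(rows)
for phrase, translations in issues.items():
    print(f"Query: '{phrase}' has {len(translations)} translations: {translations}")
```

Anything this flags is a question to raise with the field agency, not proof of an error. Two wordings can both be valid, for example where one is an interviewer instruction and the other is read out to the respondent.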
As much as I have focussed my working scenarios around CAPI and PAPI above, it’s the same for CATI or CAWI! Maybe some very specific differences only, but the principle is the same.
Point 4 – build a friendly relationship with the collection scripting team
I always like to know who I am dealing with by name and what I am dealing with re: the platform being used. I then like to introduce myself, explain that I am supporting, assisting and throwing in a recommendation or two, but in no way am I the “scripting police”. I share some of my background and, once we have a relationship, I have found there is a two-way dialogue with the scripter and together we iron out as much ambiguity, uncertainty, complexity and researcher nuance as we can. The scripter scripts, my team and I test in as many ways as we are allowed, and between us we dry-run the questionnaire too. It’s a simple remedy but it works! It’s not per se about the personal touch, it’s simply the fact that the scripter is “key” and should be treated as a peer, and an important peer at that. Getting them to know you means they will feel more comfortable talking more. The same can be said of the supervisors, interviewers and translators, if possible. With these groups this is a more difficult challenge, so I always make sure that the scripter and I are on the same page from the get-go!
You might have thought this article was going to be more techy (or geeky). Nope! The number one rule for any given project is to remember that “Data quality and data management are always at the heart of any MR or M&E project” and as such, the golden rule that encompasses all 4 of my points is “communication”.
With my 20+ years’ experience, I can normally tell if a scripter or data management person is good, bad or ugly by the way they write an e-mail or reply to my review comments. If they can’t communicate well, you might be in trouble. Get to know them and as some of you have probably heard me say before, the “data” staff do NOT work in a black hole. Scripting and questionnaire design can be QC’d to a very high standard nowadays and this then supports the designers and scripters alike and breeds confidence and gives peace of mind. Just don’t get complacent and assume though!
Back in 2012 I presented much of what I have shared above at a conference. If this is of interest and you want to view it, then please click on this link: https://www.dropbox.com/s/eksi3dz114mrszn/Field_Agency_Presentation.pdf?dl=0
At the time, I could see that people were wondering if I was actually preaching to the converted or even patronising them. The same rules apply now as then. As much as every company will insist they are great at QC and data management, gaining peace of mind through asking questions and never assuming is, for me, a much better practice than believing claims at face value.
Hope you have enjoyed this longer article. By the way, this article is nothing directly to do with Tabx on this occasion other than a) to tell you a little more about me as a co-founder of Tabx and b) to remind you, if you are interested in Tabx, to avoid a “garbage in, garbage out” scenario, as we cannot work magic on poor data. We can impress the world with any data that has been collected and collated properly though!
If you want a one-to-one session on data quality and data management at any time too, I’d be happy to talk directly with you.
Written by Andy Madeley, July 2021