Technology is a wonderful thing, and it is fair to say that it, together with nearly ubiquitous internet access, has rapidly and completely transformed survey research. Marketing researchers and social scientists can field surveys across social media platforms, on mobile devices, and through online panels. Alas, this technological access also has a downside: data collected online is increasingly contaminated with fraudulent responses. The cause? Bots.

As reported in The Atlantic, 52% of all 2018 web traffic was generated by bots. That is particularly noteworthy because, in 2017, human traffic had overtaken bot traffic for the first time since 2012. Further, most of these bots are not the friendly helper bots we would like them to be. Helper bots make up 23% of all web traffic, which leaves 29% of traffic generated by harmful bots. Think about that. Roughly one in two web visitors is not human, and nearly one in three is a harmful bot.

According to the Fors/Marsh Group, bots are “simple computer programs that impersonate people by interacting with computer systems, automating tasks, and independently completing a wide range of online operations. The sophistication, ease of use, and range of tasks that bots can complete is growing rapidly—while the costs are steadily declining. Bots can be very useful for web scraping, monitoring websites, or aggregating news; however, they are destructive when they are used to impersonate humans.”

Bots can easily damage the accuracy of survey research when used by those bad actors who want to receive the digitally provided compensation for participation in the survey. Unscrupulous online panel members can easily obtain bots to take surveys. Even the most well-established and reputable online panel companies have been afflicted with bots, and the problem doesn’t seem to be getting smaller.

So, if bots aren’t going away, how can marketing researchers protect their surveys against this fraudulent data? We’ve already been dealing with low-quality responses from participants who speed or straight-line through the survey. How can we protect ourselves against bots as well? Here are some suggestions:

  1. Shut the door! The best way to deal with bots is to prevent them from getting your surveys in the first place. Work with reputable panel providers who understand the risk of bots, and who take measures to identify and stop them. Name and address validation is one way to identify a bot. Mailing incentives to physical addresses is another way to make your panel bot-proof.
  2. Increase your data quality checks. Even beyond speeders and straight-liners, you must take additional steps to evaluate your data quality. Tag between-item data inconsistencies (e.g., young people with improbably high incidence of certain activities), note all suspicious open-ends (duplicate or off-topic responses), and so on. Add some trap questions into your survey. You can route respondents who don’t answer them correctly out of the survey at that point or remove them from the data afterward.
  3. Check the IP and time signature. Use an IP look-up tool to identify suspicious sources of respondents. After you identify them, check out their origin. You will probably find that many of them originate from a relatively small number of ISPs with strange names and in strange locations. Blacklist these ISPs and remove all of their responses from your data.
  4. Use a probability-based sample: A probability-based sample (i.e., everyone in the desired population has a known, nonzero chance of being selected) is significantly more expensive, and harder to recruit if your population is difficult to reach; however, the quality will be higher, and there will be less chance of bots.
  5. Learn to love CAPTCHA. No one likes those CAPTCHAs, but they were designed to stop the bots and allow access to humans only. Isn’t that what you want in your survey?
  6. Password protection: Using client-side or server-side password protection prevents multiple completions from the same respondent, regardless of IP address. It won’t stop a bot, but the bot will only be able to complete the survey once.
  7. Do a pilot study with representative participants you know are human. Sometimes, when a survey is especially long, boring, or otherwise onerous, only the bots make it all the way through! Do a pilot study (either before you start data collection or after you suspect a problem) so you can compare your survey data with responses from participants you know are human.
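
Several of the checks above (speeders, straight-liners, trap questions, duplicate open-ends, and many completes from one IP address) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record fields, the 30%-of-median speed cutoff, and the limit of two completes per IP address are all hypothetical choices, not values from any panel provider or standard. Tune them to your own survey.

```python
from collections import Counter
from statistics import median

# Hypothetical respondent records; every field name here is an assumption.
responses = [
    {"id": "r1", "seconds": 420, "grid": [4, 2, 5, 3, 4], "trap": "agree",
     "open_end": "I mostly shop online for groceries.", "ip": "203.0.113.5"},
    {"id": "r2", "seconds": 390, "grid": [3, 3, 4, 2, 5], "trap": "agree",
     "open_end": "Price matters more than brand to me.", "ip": "203.0.113.9"},
    {"id": "r3", "seconds": 35, "grid": [3, 3, 3, 3, 3], "trap": "disagree",
     "open_end": "good", "ip": "198.51.100.7"},
    {"id": "r4", "seconds": 38, "grid": [3, 3, 3, 3, 3], "trap": "agree",
     "open_end": "good", "ip": "198.51.100.7"},
    {"id": "r5", "seconds": 41, "grid": [1, 1, 1, 1, 1], "trap": "strongly agree",
     "open_end": "Good ", "ip": "198.51.100.7"},
    {"id": "r6", "seconds": 450, "grid": [5, 4, 4, 2, 3], "trap": "agree",
     "open_end": "", "ip": "192.0.2.44"},
]

# Decision rules kept as named constants so they are easy to document later.
TRAP_ANSWER = "agree"   # the trap item instructs respondents to pick this option
SPEED_FRACTION = 0.3    # assumed cutoff: faster than 30% of the median time
MAX_PER_IP = 2          # assumed cutoff: more completes than this per IP address


def flag_responses(rows):
    """Return {respondent id: list of reasons the response looks suspicious}."""
    median_seconds = median(r["seconds"] for r in rows)
    open_end_counts = Counter(r["open_end"].strip().lower() for r in rows)
    ip_counts = Counter(r["ip"] for r in rows)

    flags = {}
    for r in rows:
        reasons = []
        if r["seconds"] < SPEED_FRACTION * median_seconds:
            reasons.append("speeder")
        if len(set(r["grid"])) == 1:  # same answer on every grid item
            reasons.append("straight_liner")
        if r["trap"] != TRAP_ANSWER:
            reasons.append("failed_trap")
        text = r["open_end"].strip().lower()
        if text and open_end_counts[text] > 1:  # ignore blank open-ends
            reasons.append("duplicate_open_end")
        if ip_counts[r["ip"]] > MAX_PER_IP:
            reasons.append("crowded_ip")
        flags[r["id"]] = reasons
    return flags


flags = flag_responses(responses)
kept_ids = [rid for rid, reasons in flags.items() if not reasons]

for rid, reasons in flags.items():
    print(rid, ", ".join(reasons) if reasons else "ok")
```

A flag is a reason to review a response, not proof of a bot. A median-based speed cutoff, for instance, stops working if most of the sample is fraudulent, which is one more reason to compare against a pilot group you know is human.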

Once you have thoroughly cleaned the data by identifying suspected bots and removing respondents with poor data quality, document your decision rules. (Anyone who wants to replicate your study will need this information.) And for best results all around, communicate your findings to your panel or sample provider. We may not be able to get rid of the bots, but by working together we can keep them under control.

