Are A/B Tests Misleading Market Researchers and Online Advertisers? New Research Suggests They Might Be

Researchers at Southern Methodist University and the University of Michigan have published a study in the Journal of Marketing that scrutinises the A/B testing of online advertisements on digital advertising platforms, revealing significant flaws that can lead marketers to incorrect conclusions about ad performance.

The study was co-authored by Michael Braun and Eric M. Schwartz. To illustrate the problem, consider a landscaping firm that prioritises native plants and water conservation in its designs. The firm crafts two distinct advertisements: one highlighting sustainability (ad A) and another emphasising aesthetics (ad B). Because platforms personalise ad delivery, ads A and B reach different demographic groups: users interested in outdoor activities might encounter the sustainability ad, while those keen on home decor could see the aesthetics ad. This targeted delivery is central to the platforms’ appeal, since it seeks to present the “right” ads to the “right” users and adds significant value for advertisers.

However, Braun and Schwartz’s research indicates that the insights provided by online A/B testing on digital advertising platforms may not be as reliable as marketers believe. They identify key limitations in the platforms’ tools for online ad experimentation that can lead to misleading interpretations of ad effectiveness.

The concept of “divergent delivery” is central to their critique. This phenomenon occurs when platform algorithms, such as those used by Meta and Google, steer the two ads in an A/B test towards different groups of users. The test is intended to compare the effectiveness of the two ads, but the comparison breaks down when one ad performs better simply because it was shown to users who were more likely to respond, rather than because of the ad’s content. As Braun points out, an ad can appear better or worse depending on who saw it, not on its creative quality.
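To see why this matters statistically, a minimal simulation helps (this is not taken from the study; the segment names, audience mix, and response rates below are illustrative assumptions). Both ads are modelled as equally persuasive, yet the ad routed to the more responsive audience appears to “win” the test.

```python
import random

random.seed(42)

# Illustrative assumptions (not from the study): two audience segments
# with different baseline likelihoods of responding to any landscaping ad.
BASE_RATE = {"outdoor_fans": 0.02, "decor_fans": 0.05}

def responds(segment):
    """Both ads are assumed equally persuasive: a user's chance of
    responding depends only on which segment they belong to."""
    return random.random() < BASE_RATE[segment]

def run_divergent_delivery_test(n_impressions=100_000):
    results = {"A": [0, 0], "B": [0, 0]}  # ad -> [responses, impressions]
    for _ in range(n_impressions):
        # Hypothetical targeting: ad A (sustainability) is mostly shown to
        # outdoor fans, ad B (aesthetics) mostly to home-decor fans.
        if random.random() < 0.5:
            ad = "A"
            segment = "outdoor_fans" if random.random() < 0.8 else "decor_fans"
        else:
            ad = "B"
            segment = "decor_fans" if random.random() < 0.8 else "outdoor_fans"
        results[ad][1] += 1
        results[ad][0] += responds(segment)
    for ad, (hits, shown) in results.items():
        print(f"Ad {ad}: {hits / shown:.2%} response rate over {shown} impressions")

run_divergent_delivery_test()
```

Under these assumptions, ad B reports a markedly higher response rate than ad A even though neither ad is intrinsically better; the gap is produced entirely by which users each ad reached.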

The ability to target effectively is invaluable for advertisers, especially those with large target audiences and finite budgets. Major companies like Google and Meta use algorithms to distribute ads to specific users. In these platforms’ auctions, advertisers compete to display their ads to users, with winners determined not just by bid amounts but also by the relevance of the ad content to the user. However, the criteria and methodologies used by these platforms to assess ad relevance and influence auction outcomes are proprietary and not transparent to advertisers.
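As a rough illustration of how relevance enters the picture, the sketch below ranks competing ads by the product of a bid and a hypothetical relevance score, a common textbook simplification. The scoring Google and Meta actually use is proprietary and far more elaborate, so this should be read only as a conceptual sketch of why the highest bid does not automatically win.

```python
from dataclasses import dataclass

@dataclass
class AdEntry:
    advertiser: str
    bid: float        # amount offered for the impression or click
    relevance: float  # hypothetical 0-1 score of fit between ad and user

def rank_ads(entries):
    """Rank ads by bid * relevance (a quality-adjusted bid).

    This is a simplified stand-in for proprietary platform scoring; it only
    shows that a lower bid can win when the ad is judged more relevant."""
    return sorted(entries, key=lambda e: e.bid * e.relevance, reverse=True)

auction = [
    AdEntry("landscaper_ad_A", bid=2.00, relevance=0.30),
    AdEntry("landscaper_ad_B", bid=1.50, relevance=0.60),
    AdEntry("competitor",      bid=2.50, relevance=0.20),
]

for rank, entry in enumerate(rank_ads(auction), start=1):
    print(rank, entry.advertiser, f"score={entry.bid * entry.relevance:.2f}")
```

In this toy auction the lowest bid wins because its relevance score is highest, which is exactly the kind of platform judgement that advertisers cannot observe.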

The implications of these findings are significant for marketers who depend on A/B testing to shape their online advertising strategies. According to Schwartz, “Because of low cost and seemingly scientific appeal, marketers use these online ad tests to develop strategies even beyond just deciding what ad to include in the next campaign. So, when platforms are not clear that these experiments are not truly randomised, it gives marketers a false sense of security about their data-driven decisions.”

The researchers argue that the problems identified are not merely technical issues with the tools but reflect a fundamental characteristic of the online advertising industry. The primary aim of these platforms is to maximise ad performance, not to deliver experimental results that marketers can independently assess. Consequently, the platforms have little incentive to help advertisers separate the effects of ad content from the effects of proprietary targeting algorithms. This leaves marketers in a challenging position: they must either accept potentially confounded results from these tests or invest in more elaborate and expensive methods to genuinely understand the influence of creative elements in their ads.

The study employs simulation, statistical analysis, and real-world examples from A/B tests to support its argument, challenging the prevalent belief that A/B test results can be equated with those from randomised experiments. Marketers who recognise these limitations can make more informed decisions and avoid the pitfalls of misinterpreting data from these tests.

More information: Michael Braun and Eric M. Schwartz, “Where A/B Testing Goes Wrong: How Divergent Delivery Affects What Online Experiments Cannot (and Can) Tell You About How Customers Respond to Advertising,” Journal of Marketing. DOI: 10.1177/0022242924127588

Journal information: Journal of Marketing. Provided by the American Marketing Association.
