Best Practices for Panel + Big Data

By Bill Harvey, In Terms of ROI

A year or so ago, an e-blast from one of the new cross-platform audience analytics services referred to panels as "old fashioned" and "on their way out." The same person recently commented publicly that panels are "very useful things to have" and that "everyone will be using panels to calibrate big data."

Over the last year it has become clear to most players in audience measurement that the ANA and VAB have stuck by the WFA mandate that big data needs to be used, but calibrated by means of a representative probability sample panel of people.

Now that the industry has reached a consensus on this principle, it appears timely to review the best practices for carrying out this general design. This is important because not just any panel will do, and the way the two types of data are combined is a pivotal factor in maintaining the inherent stability of big data.

Any Measurement System Will Tend to Affect the Data It Collects

Heisenberg was the first (February 1927) to bring this fact to light in the context of measuring subatomic phenomena. Marketing and media researchers knew about the Hawthorne effect at about the same time (1924-1932), although the term itself was coined by Henry A. Landsberger in 1958. The Hawthorne effect states that if people know they are in some sort of study, their performance improves. This is similar to John Kenneth Galbraith’s definition of conscience as "the awful suspicion that someone is watching." (He may have been quoting H.L. Mencken.)

If one were to try to base cross-platform audience measurement solely on big data, the biases introduced into the estimates would result from a number of causes, including missing pieces of the audience and artifacts created by the algorithms used to adjust for those missing pieces. This is because big data today are convenience samples, which cover some households but not others.

Media Communication Implies People

Most of the money that makes media possible comes from advertising. Advertisers use media advertising as one of the main ways of increasing the number of customers who buy their brands, and of keeping those people as customers by reminding them to buy the brands.

To base a measurement system solely upon device measures (which is what all audience big data are) automatically sets up a procedural bias that puts people somewhere in the background: something to be inferred, an afterthought, something secondary that can supposedly be estimated easily and reliably. That is not the case in reality; inferring people data accurately from device data has proven supremely challenging, if not impossible.

Panel Implies Opt-In

Some of the new big data audience analytics platforms call their sample a panel. This is a misuse of the term. One can have a big data sample, but it will only be a panel if the people in the sample have consciously agreed to be measured. Not opting out of device measurement does not constitute conscious agreement to be a panelist. If it’s device measurement and not people measurement, it is not a panel.

The Best Panels are Area Probability Samples of People in Households

Probability samples are the scientific method for achieving representative samples of a given universe, e.g., the U.S. household population. Without a probability sample you cannot apply statistical laws and know what your sampling tolerance range is. Big data convenience samples cannot validly be treated as if they represented total U.S. population audiences, nor can sampling errors validly be calculated for them, although this deters no one from pretending otherwise.

Area probability samples are more inclusive and accurate than telephone-based probability samples drawn by random digit dialing.

The Challenges of Executing Panels Properly Have Been Famously Under-Estimated

Owen Charlebois was the man who opened the door to global rollout of the PPM (portable passive peoplemeter) by first testing it in Canada, and then adopting it for the Canadian JIC, of which he was CEO. Google, in its plans to mount panels in all nations, brought Owen into the company, and he did a great job of setting up many panels around the world. Google admitted that it thought panels would be easy and learned the opposite the hard way, before bringing in an expert. Even with Owen aboard, Google eventually backed off from panels as not being its core competency. This is the same Google that is barreling ahead in AI and yet finds audience measurement not the child’s play it seemed.

Size of the Panel Matters, and How You Calibrate

Calibration tends to cause big data to act as if they had the sample size of the calibration panel. In my consulting for Nielsen (which is ongoing) it took a year working with the leadership teams to devise the best way to mitigate the stability loss caused by calibration. Nielsen has recently advanced this work by a quantum leap, and the new results and the dates of currency productization will soon be announced.

Nevertheless, even with innovative new ways of calibrating big data using probability sample people panels, the calibration sample size and design will have an impact, and so you want the largest calibration panel that the industry can afford. To illustrate, note that the confidence range around device data at scale might be in the range of plus or minus 2%, but if you calibrate that using a panel of 5,000, depending on the way you put the data together, this range could expand to plus or minus 28%.
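The article does not say how the 2% and 28% figures were derived. As a rough sketch only, the textbook margin-of-error formula for a simple random sample gives numbers of the same order if one assumes a 1% rating, a 95% confidence level, and an effective big data sample of about one million. All three are assumptions made for illustration, as is the function name `relative_margin`; this is not Nielsen's calibration method, and because big data are not a probability sample the formula applies to them only loosely.

```python
import math


def relative_margin(rating, n, z=1.96):
    """Margin of error expressed as a fraction of the rating itself.

    Assumes a simple random sample of size n; z=1.96 corresponds to
    a 95% confidence level. Both are illustrative assumptions.
    """
    standard_error = math.sqrt(rating * (1.0 - rating) / n)
    return z * standard_error / rating


rating = 0.01  # hypothetical 1% rating, chosen only for illustration
samples = [
    ("effective sample of 1,000,000", 1_000_000),  # assumed big data scale
    ("calibration panel of 5,000", 5_000),
]
for label, n in samples:
    print(f"{label}: +/- {relative_margin(rating, n):.1%} of the rating")
```

Under these assumptions the range comes out at roughly 2% of the rating for the large sample and roughly 28% for the panel of 5,000. Because the margin shrinks with the square root of the sample size, quadrupling the calibration panel only halves it.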

Response Rate Also Matters

Individual differences that relate to brand choice (crucial to advertisers) cannot be accurately predicted based on demographics and media choices (the correlations are too low). The lower the response rate, the more likely that the "correction" by weighting is not really correcting all the results properly.

The MRC finds that the highest levels of response rate achievable by a panel nowadays are around 30%. That means the Original Predesignated Homes in the panel represent about 30% of the panel. The other 70% are systematic substitutions in a properly executed and MRC-accredited scientific pattern within an area probability sample.

If you are going to build a calibration panel, budget for the number of callbacks and the value of monetary incentives needed to come as close to 30% as possible. And use the best-trained team possible, because each panelist sign-on is about as hard as getting a vote.

Collect People-Level Data in Your Calibration Panel and Cover All of the Use of the Media Studied

A calibration panel ought not to be a subset of the big data sample. We say this mainly to ensure sampling quality and the collection of proprietary meter data, rather than relying on set top box and/or smart TV data in those homes. The reason is that there are errors in identifying programs, ads, and even networks in smart TV data, and set top box data are sometimes inaccurately coded for program source. The calibration panel must have an MRC-grade metering system at the people level, so that by use of the common device method (Nielsen gave it that name) the calibration panel can also bring the program crediting up to the level of watermark accuracy (the highest accuracy metering component).

The more obvious reason for calibration panels is that certain types of users may not be represented at all in a big data sample. The number one reason for using area probability calibration people panels is to keep all the fragmented segments of the media in the proper relation to one another even as they change from day to day.

Unless the people metering in the calibration panel uses widely accepted watermarks and measures linear, CTV, streaming, antenna, all devices connected to TV sets, computers, mobile, and out of home, the full benefits of calibration will not be delivered. Even a patchwork of calibration is better than none, but the purpose of this article is to urge going all the way. Otherwise, why do it in the first place?

Posted at MediaVillage through the Thought Leadership self-publishing platform.

The opinions expressed here are the author's views and do not necessarily represent the views of MediaVillage.org/MyersBizNet.

Bill Harvey

Bill Harvey, who won an Emmy® Award in 2022 for his invention of set top box data, has spent over 35 years leading the way in media research with pioneer thinking in New Media, set top box data, optimizers, measurement standards, privacy standards, the A…