Best Practices for Integrating Big Data Status Report

By Bill Harvey, In Terms of ROI

If you’re a Nielsen subscriber, you may be wondering why Nielsen decided to postpone the approval of its Big Data Plus Panel study for use in transactions, which had originally been scheduled for August 29, 2022. This post will explain in detail. The short answer is that in Q2 2022 the Nielsen team discovered imperfections in big data (BD) of a kind that had not previously been illuminated. Although Nielsen strove to create a correction system for these newly identified flaws in time for the planned August 29 "thumbs up," it’s going to take some months longer. Nielsen reached that decision about three weeks before the planned approval date and did the right thing by delaying for a finite period.

Some time ago, Nielsen asked Marty Frankel and me to help design the original and optimal integration of panel and Big Data (BD). Marty was asked largely because of his extensive history of advising Nielsen and the Media Rating Council (MRC) in all matters statistical, and I was asked because of my long history of development and refinement in the TV BD space. Nielsen thinks of us as two "old" pros accustomed to working closely together to achieve success. To date, clients seem to be patient with the timing of the pushback and remain confident. A series of special meetings with MRC and its TV Committee, initiated by Nielsen and held since April 2022, has been keeping clients advised of the research work carried out under the guidance of Marty and me on behalf of Nielsen.

The Nielsen people who deserve all of the credit for the methodological research findings and correctives include David Kurzynski, Meghan Beeman, Yong Flannagan, Cermet Ream, Samantha Mowrer, Bala Shankar and Noel Thomas, with guidance from Mainak Mazumdar, Scott Brown, Jonathon Wells, Kimberly Gilberti, Marty Frankel and me. These fine folks also helped write this article.

With the foregoing as the précis, here is the more complete account:

The integration of panel + BD that Nielsen has put in place is called the Impact study, meaning that its reporting is intended to give clients insight into the impact of the BD by comparing panel + BD data with panel-only data. The panel data continues to be used as the main currency of the television business until such date as Nielsen considers the two sources both valid and reliable enough to serve as the basis for monetary transactions. Once that moment has been reached, each client will be able to decide which to use as its currency (panel only or panel + BD).

In September 2021, a widely distributed report showed the industry that the Impact data was on average within about 1% of panel-only data. For the most part, panel + BD was very slightly higher than panel only, except among adults 25-54, where panel + BD was 1.2% lower than panel only. The latter was somewhat concerning; the other concern was that the BD had not stabilized the report-to-report fluctuations as much as had been assumed. For these reasons, Nielsen and its clients decided to postpone the classification of panel + BD as currency and to continue the work to resolve these remaining concerns. This led to Marty and me being engaged, and our current work with Nielsen began in late 2021.

Having been awarded an Emmy for the "pioneering development" of privacy-protected set top box (STB) data, and having also worked with smart TV data, I had a "going-in" suspicion that the smart TV data was the source most worth studying intensively. Before commencing the Nielsen engagement, these were the concerns I had about smart TV (STV) data:

A typical user of STV data is gathering data from only about 1.1 TV sets per average household, while potentially missing the other 1.7 TV sets in the average household. This is not true of STB, which tends to cover virtually every TV set (with minimal exceptions) in a measured household.
STV does not measure all networks and all stations.
STV is not "householded" (2+ TV sets that are in the same household are not identified as such).
Contrary to popular belief, streaming is not measured by STV. The fingerprinting system shuts down when native apps are used. (Most STB also does not measure streaming, though Nielsen panel data does measure streaming properly.)
STV tends to credit tune-in ads as short tunings to the actual programs being advertised.

Nielsen was already aware of most of these problems and had developed a corrective system for the first three listed above, which is already reflected in the Impact data.
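The arithmetic behind the first item in the list above can be made explicit. The sketch below only combines the two per-household figures quoted there; the variable names are illustrative, and the resulting share is a rough average rather than a Nielsen statistic.

```python
# Per-household averages quoted in the list above.
measured_sets = 1.1  # TV sets per household a typical STV data user sees
missed_sets = 1.7    # TV sets per household potentially not seen

# Implied average number of sets per household and the share STV covers.
total_sets = measured_sets + missed_sets
coverage = measured_sets / total_sets

print(f"Average TV sets per household: {total_sets:.1f}")
print(f"Share of sets covered by STV:  {coverage:.0%}")
```

On these figures, STV data would see roughly two of every five TV sets in an average household, which is why the contrast with STB's near-complete in-home coverage matters.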

Because of my suspicions, we led the team to analyze the STV data most intensively, while also applying all the same analytics to set top box (STB) data. In April 2022 we reported to the MRC TV Committee that we had found another STV problem: compared with the Nielsen panel, STV was leaving a higher percentage of total tuning uncredited, and that tuning could only be included as AOT (All Other Television). We found that some of that extra AOT was actual tuning to broadcast and cable that the Nielsen panel had credited to specific networks and stations but that STV was unable to identify via its content recognition technology. This was immediately recognized as the probable cause of the 1.2% shortfall in the Impact data among adults 25-54.

This seemed as if it was going to be easy to fix, by creating another corrective system like the one Nielsen had already created for STV-unmonitored networks and stations, except this one would be for STV-monitored networks and stations. We began to proceed along those lines, developing such a corrective system. However, we also continued to analyze the STV and STB BD as deeply as possible.

While this work was going on, we were also working on another track, aimed at increasing the report-to-report stability. The idea was to project the BD beyond its own footprint. As described in my earlier article in this series, both MRC and Nielsen leaned toward using BD to represent the homes providing the BD, but not projecting from them to households not providing their own BD. MRC does not prohibit projection beyond footprint, but requires that such projection be justified by methodological research proving its validity. So our team conceptualized a couple of dozen different scenarios involving the four BD providers to Nielsen at the present time. The previous article reports more detail on this process.

Based on the similarities and differences from one slice of the population to another, we identified STV homes that are internet connected and are not BD satellite homes as the one type of US home where the current providers' homes had viewing most similar to that of the total population of such homes. The projection beyond footprint for that cohort would increase the weight of BD in the Impact data from its present 25% to 66%, presumably adding significant stability from report to report.

However, this work could not be carried out until the STV "unidentified crediting" problem was fixed, because giving STV data more weight would increase its deflationary effect. So projection beyond footprint was paused while we worked feverishly on STV. That’s when we discovered that there was more to the problem than we had realized.

One of the most powerful tools Nielsen has is the Common Homes (CH) Methodology. Because Nielsen has a meter panel based on an area probability sample with the highest known response rate of any commercial panel or survey in the world, the meter data (repeatedly accredited by the MRC) can be used as a standard of relative truth, against which to compare BD in the exact same home as the meter. The technique goes much deeper: it’s not only Common Homes, it’s also Common Devices (CD) and Common Tuning Minutes (CM). In other words, if measuring a specific TV set at a given minute yields a meter credit to network X, what does the BD credit for that same set and minute?

That’s when we discovered that the AOT phenomenon is not the whole of the problem, rather only one symptom of it: STV crediting deviates from meter more than STB does. Why is this? At first, we thought it might be related to the fingerprinting being confused by ads, many of which do not appear in STV reference libraries, and so searching for an impossible match might sometimes cause a forced match to a wrong network. This turns out to be true sometimes (miscrediting twice as likely during commercial minutes) but not enough to account for all the errors.
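To illustrate what "twice as likely during commercial minutes" means in practice, here is a minimal sketch that splits matched minutes by type and compares miscrediting rates. The record layout, the function name and the sample data are all made up for illustration; this is not Nielsen's analysis code or data.

```python
from collections import Counter

def miscredit_rates(minutes):
    """Return the miscrediting rate per minute type.

    `minutes` is an iterable of (is_commercial, meter_source, bd_source)
    tuples, one per matched tuning minute (hypothetical layout).
    """
    total = Counter()
    wrong = Counter()
    for is_commercial, meter_source, bd_source in minutes:
        kind = "commercial" if is_commercial else "program"
        total[kind] += 1
        if meter_source != bd_source:
            wrong[kind] += 1
    return {kind: wrong[kind] / total[kind] for kind in total}

# Made-up sample: 10 commercial minutes (2 miscredited) and
# 10 program minutes (1 miscredited).
minutes = (
    [(True, "NET_A", "NET_A")] * 8 + [(True, "NET_A", "NET_B")] * 2
    + [(False, "NET_A", "NET_A")] * 9 + [(False, "NET_A", "NET_B")] * 1
)

rates = miscredit_rates(minutes)
ratio = rates["commercial"] / rates["program"]
print(rates)
print(f"Commercial minutes miscredited {ratio:.1f}x as often as program minutes")
```

A ratio of about 2 on real data would match the pattern described above, while still leaving most miscredited minutes to be explained by other causes.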

Another factor has nothing to do with fingerprinting: sometimes an STV provider will utilize a secondary algorithm involving attribution based on network affiliate and the smart TV’s geolocation, to be used when fingerprinting results in an uncertain match; however, this can also contribute to crediting the wrong affiliate. Again, this type of error by itself cannot account for all of the errors.

Using the Common Devices method, Nielsen is able to look at the same tuning minute in the same smart TV as measured by the Nielsen meter and as measured by the big data provider for that smart TV. The comparison is made for Live tuning, leaving out the effects of unmonitored program sources and leaving out the uncredited STV tuning that goes into AOT. Where both the meter and the smart TV BD have credited a specific program source, the smart TV BD crediting agrees with the Nielsen meter crediting for that same TV set 77% of the time. By comparison, when the same method is applied using set top box data, the agreement rate is 88%.
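The agreement rate described above can be sketched as a simple calculation over matched device-minutes. This is a minimal illustration under stated assumptions: the `MinuteCredit` record, the `agreement_rate` function and the sample data are hypothetical, and Nielsen's actual processing (Live-only filtering, handling of unmonitored sources) is not shown.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

AOT = "AOT"  # All Other Television: tuning the source could not credit

@dataclass(frozen=True)
class MinuteCredit:
    device_id: str           # the common TV set
    minute: int              # the common tuning minute (index within the day)
    meter: Optional[str]     # program source credited by the meter
    big_data: Optional[str]  # program source credited by the BD provider

def agreement_rate(records: Iterable[MinuteCredit]) -> Optional[float]:
    """Share of common device-minutes where both sources credit the same
    program source. Minutes where either side has no credit or only AOT
    are excluded from the base."""
    comparable = [
        r for r in records
        if r.meter not in (None, AOT) and r.big_data not in (None, AOT)
    ]
    if not comparable:
        return None
    matches = sum(1 for r in comparable if r.meter == r.big_data)
    return matches / len(comparable)

# Made-up sample: five common minutes across two TV sets.
sample = [
    MinuteCredit("tv-1", 1200, "NET_A", "NET_A"),
    MinuteCredit("tv-1", 1201, "NET_A", "NET_B"),  # disagreement
    MinuteCredit("tv-1", 1202, "NET_A", AOT),      # excluded: BD uncredited
    MinuteCredit("tv-2", 1200, "NET_C", "NET_C"),
    MinuteCredit("tv-2", 1201, "NET_C", "NET_C"),
]

print(agreement_rate(sample))  # 3 of 4 comparable minutes agree: 0.75
```

Running the same function separately on STV and STB records would give the two figures being compared. Note that excluding AOT minutes from the base means the rate captures wrong credits only, not missing ones.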

We are still in the midst of pinning down the size of each of the error types, and are currently investigating a type of error that, together with the other types, looks pervasive enough to account for all the errors. Apparently, there is an interaction effect between satellite and smart TV which is capable of confounding fingerprinting in cases where there are momentary losses in signal strength, typically associated with moisture (rain, snow, etc.) and other atmospheric/astronomical conditions.

As Nielsen Technology Senior Vice President Scott Brown explains, "Satellite television has the link of the transmission to the dish, to the decoder box in the home, that is subject to weather issues. In effect, heavy rain or snow can impede the signal during the weather phenomena and the screen goes black. The other RPD players do not have this weakness of transmission. As we roll out our advanced StreamFP – a new Gracenote granular signature method which will be employed for Nielsen One - our signatures become more granular and we might do better in identifying small fragments of exposure during signal strength drops. We should have data forensics to help us determine more via granularity reveal."

Nielsen clients have expressed the desire that major methodological improvements be made only at the start of a new season. It looks like by the start of next season the Nielsen methodology for integration of Big Data with panel will have been thoroughly tested and proven to be the most valid and stable that is scientifically achievable. The ongoing tracking of the non-currency Impact data during the current broadcast season will show a convergence of panel-only data and panel plus Big Data as the current season wears on, as a result of our ongoing work to ferret out each source of error we can find, and to determine how to best project the corrected data so as to achieve the stability that the industry needs.

The opinions expressed here are the author's views and do not necessarily represent the views of MediaVillage.com/MyersBizNet.

Bill Harvey

Bill Harvey, who won an Emmy® Award in 2022 for his invention of set top box data, has spent over 35 years leading the way in media research with pioneer thinking in New Media, set top box data, optimizers, measurement standards, privacy standards, the A…