Beyond 101: The Promise of Data Clean Rooms
While at the IAPP’s PSR 2023 conference, I had the pleasure of moderating a Privacy Engineering workshop focusing on Data Clean Rooms (“DCR”).
The discussion focused on DCRs’ promise for a more unified and privacy-protective adtech ecosystem, with:
Myles Younger, Head of Innovations & Insights at U of Digital
Tami Harrigan, Vice President, Business Development - Privacy Cloud at AppsFlyer
Pressure points
The disappearance of third-party cookies and mobile ad IDs, and Google’s arguably anticompetitive ‘Privacy Sandbox’ browser changes, were top of mind for the panelists. Whether the initiative actually promotes privacy or just puts more control in the hands of Google can be debated, but those skeptical of the Sandbox’s efficacy see the need to test DCRs as a viable complement (or alternative) to it.
It isn’t just about the Sandbox, however. The loss of cookies and other ‘signals’ underpinning digital marketing – particularly in mobile – spells a shift back to first-party data, an arena where the largest of media companies, agencies, and walled gardens have an endemic advantage. The result? Competitive pressure driving those with the most first-party consumer relationships to raise even higher walls.
Last but not least, in the face of mounting regulatory scrutiny on both sides of the Atlantic, the ad industry has been looking to regain regulatory and consumer trust. As Myles put it, in this environment “clean room equals table stakes”.
Clean rooms in the spotlight
In offering a range of privacy-enhancing technologies (“PETs”) together with granular data access controls, DCRs promise to securely deliver:
Informed media planning, buying and selling;
Effective addressability in cookieless (and mostly logged-in) environments;
Data modeling and enrichment using both internal and external data sources;
Trusted matching and activation of shared and similar (aka “lookalike”) audiences;
Campaign personalization and optimization; and
Sophisticated measurement, insights and analytics
In other words, common use cases that can be grouped into three categories across the marketing lifecycle: Media Planning, Campaign Execution and Analytics/Intelligence.
There is no single, unifying standard for DCRs within the adtech context yet, but panelists invited participants to think of them collectively as ‘data collaboration’ tools -- a broad category that includes ID graphs, enclaves and marketplaces, and tokenized data matching solutions among others.
According to Myles, the distinction is less important than the beneficial outcome: “If there are no real people reflected in the underlying data, then every custom query tool and reporting system is a ‘clean room’ going forward.”
AdTech x DCRs x Privacy
To meet marketers’ specific needs, DCR solutions are looking to balance a stronger desire for reliable (i.e. first-party) data with data minimization and other privacy-first principles.
Specifically, DCRs aim to address:
Data leakage. For every authorized tag or cookie, a website may be harboring four unauthorized ones. To address unwanted (and unintentional) data disclosures, DCRs take a locked-door approach -- no party can access another’s data because they cannot enter the clean room.
Security concerns. The concept is not new, but DCR platforms take full advantage of Cloud capabilities by, for example, providing dedicated hosting for each data owner -- what AppsFlyer refers to as ‘self-owned buckets’. Continuing the locked door analogy, data owners get control of the keys and their guest lists, and the
Consumer (re)identification. DCRs can leverage a range of Privacy Enhancing Technologies such as:
Differential Privacy is a data science technique that adds noise to a marketer’s customer data set before it is, for example, shared with an engagement optimization partner for analysis.
Private Set Intersection “PSI” (or Double Encryption) is a protocol that allows a smaller marketer to privately compare their data against a larger partner’s to, for example, understand shared audiences.
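K-Anonymity is a ‘hiding in the crowd’ technique that generalizes or suppresses identifying attributes so that each individual’s record is indistinguishable from at least k-1 others, protecting that individual from being singled out.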
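To make two of the techniques above more concrete, here is a minimal Python sketch of how a clean room report might combine a k-anonymity-style minimum group size with differential-privacy-style noise. It is an illustration only, not how any particular DCR vendor implements these PETs: the function names, the sample records, the threshold k=3 and the privacy parameter epsilon=1.0 are all assumptions made for the example. It also assumes each person contributes to at most one count, so that noise with scale 1/epsilon is appropriate.

```python
import random
from collections import Counter


def group_counts(records, keys, k):
    """Count records per group, dropping groups with fewer than k members."""
    counts = Counter(tuple(r[key] for key in keys) for r in records)
    return {group: n for group, n in counts.items() if n >= k}


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def add_noise(counts, epsilon, rng):
    """Perturb each count with Laplace noise of scale 1/epsilon."""
    return {g: n + laplace_noise(1 / epsilon, rng) for g, n in counts.items()}


# Hypothetical customer records, already generalized into coarse bands.
records = (
    [{"age_band": "25-34", "region": "West"}] * 3
    + [{"age_band": "25-34", "region": "East"}] * 2
    + [{"age_band": "35-44", "region": "East"}]
)

rng = random.Random(7)  # seeded only to make the demo repeatable
safe = group_counts(records, ["age_band", "region"], k=3)
noisy = add_noise(safe, epsilon=1.0, rng=rng)
print(safe)   # only the group with at least 3 members survives
print(noisy)  # same group, with a perturbed count
```

Small groups never leave the environment, and the counts that do leave are blurred. A production system would need a careful privacy analysis (for example, of how thresholding and noise interact across repeated queries) rather than this simplified pairing.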
Last but not least, DCR providers are looking to democratize PETs by making matching, modeling and reporting as business-friendly and plug-and-play as possible -- no data science or software engineering degrees required.
Key use cases
With cookies or without, marketers are looking to solve increasingly complex but familiar ROI problems: ‘If we don’t have IP addresses can we still frequency-cap?’ ‘If we lose access to profileable signals, can we still engage in cross-device targeting?’ ‘Can we still pinpoint all the touchpoints that led to a purchase?’ And, ‘can we compete with walled gardens?’
Media Planning. Since DCRs are secure, access-gated environments, multiple departments, divisions or companies can bring their campaign and audience data together for multi-party analysis. Can a campaign work as well on CTV as it did on Social? Can it drive traffic – foot or digital – cost-effectively? In this case collaboration means collective planning.
Campaign Execution. With DCRs, marketers can match their first-party data against that of participating media owners and platforms. Are shared audiences consuming content in similar ways? Are they likely to respond positively to the campaign? These insights can then be used to deliver targeted, data-informed campaigns.
Analytics/Intelligence. The collaborative nature of DCRs makes it possible for clean room participants to fill in their respective blind spots in how campaigns have performed, incrementally and across touchpoints. For example, a retailer can bring sales data and a media owner can bring campaign exposure data. In this case collaboration means collective intelligence.
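The audience matching described under Campaign Execution can be pictured with the double-encryption flavor of Private Set Intersection mentioned earlier. The Python sketch below is a toy, single-process illustration under stated assumptions: the function names, the example email addresses and the small prime modulus are hypothetical choices for readability, and both parties' steps run in one function. A real deployment would rely on a vetted cryptographic library, a properly sized group, and a network exchange between the parties.

```python
import hashlib
import math
import secrets

P = 2**127 - 1  # a Mersenne prime; far too small for real-world security


def hash_to_group(identifier):
    """Normalize an identifier and hash it to an integer in [1, P-1]."""
    digest = hashlib.sha256(identifier.strip().lower().encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1


def new_secret_key():
    """Pick a random exponent coprime to P-1 so encryption is one-to-one."""
    while True:
        key = secrets.randbelow(P - 3) + 2
        if math.gcd(key, P - 1) == 1:
            return key


def encrypt(values, key):
    """Commutative 'encryption': raise each value to the key modulo P."""
    return [pow(v, key, P) for v in values]


def private_overlap(brand_ids, publisher_ids):
    """Return the brand's identifiers that also appear in the publisher's list."""
    a, b = new_secret_key(), new_secret_key()  # one secret per party
    brand_once = encrypt([hash_to_group(i) for i in brand_ids], a)
    publisher_once = encrypt([hash_to_group(i) for i in publisher_ids], b)
    # Each side re-encrypts the other's values; exponentiation commutes,
    # so shared identifiers end up with identical double-encrypted values.
    brand_twice = encrypt(brand_once, b)
    publisher_twice = set(encrypt(publisher_once, a))
    return [i for i, c in zip(brand_ids, brand_twice) if c in publisher_twice]


brand = ["ana@example.com", "bo@example.com", "cy@example.com"]
publisher = ["BO@example.com ", "cy@example.com", "dee@example.com"]
print(private_overlap(brand, publisher))
```

Neither side ever sees the other's raw identifiers, only values encrypted under a key it does not hold, yet the brand still learns which of its own customers are shared.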
Real world applications
Data clean rooms, despite heavy press coverage and strong adoption in the advertising sector, are still a nascent technology. But as Myles pointed out, PETs themselves are more ingrained. For example, U.S. Census analysts relied on data suppression and aggregation to protect confidentiality as far back as 1790. And the idea of using differential privacy to protect ever larger, electronic census datasets was formalized in 1979.
With today’s technology and unprecedented access to computing power, anyone can create and experiment with their own data clean room in Amazon or Google Cloud Platform.
Those responsible for their organization’s privacy technologies may already know its clean room adoption and testing strategy. But for those who don't, Tami sent them home with a call to action:
“As a privacy leader, you should be approaching your CMO, CDO, legal counsel, or whoever is overseeing the company's privacy stack, and start the conversation. Privacy leaders need to be taking charge to ensure we as an industry are operating with a privacy-by-design approach, and the best way to do that is by developing and executing a strategy internally and with our industry partners.”
Cause for optimism
For the panelists, Data Clean Rooms offer a technical path to a more privacy-protective online advertising ecosystem. To that end, DCRs can be thought of as an emerging infrastructure for data collaboration, where marketers and their technology partners can use a combination of PETs to solve a variety of business use cases.
As a privacy professional, I would be remiss in not reiterating that with DCRs it’s not just the fact that such collaboration can occur – it’s that it occurs without any of the participants accessing each other’s data! This in and of itself is a case for more, genuine testing of DCRs.