The Real Ways Synthetic Data Is Changing Advertising

This solution is redefining digital advertising as we know it

Data is the lifeline for digital advertising. As the digital landscape continues to change, so too will data. Whether it’s evolving data privacy regulations or third-party cookie deprecation, access to high-quality data—data that’s clean, well structured and free of bias—is a challenge for marketing organizations.

As data becomes increasingly difficult to collect, store and activate, digital marketing leaders need solutions that maintain effectiveness while protecting their customers’ privacy. Enter: synthetic data.

What is synthetic data?

Synthetic data is data produced by generative AI techniques to supplement scarce data, mitigate bias or preserve data privacy. Synthetic data sets are artificially generated (i.e., not obtained from direct observations of the real world) to retain the statistical and behavioral aspects of real data sets without compromising the privacy of the individuals from whom the data was collected.

While marketers may be tempted to dismiss synthetic data as “fake data,” it can actually be quite powerful. Synthetic data can be used to generate data sets that would otherwise be impractical because of collection limitations or regulatory restrictions, making it both available and applicable to various marketing objectives.

We will likely see a wider embrace of generative AI technologies like synthetic data in targeted advertising within the next two to five years. Marketers need to start preparing right now for the day synthetic data becomes a norm in advertising.

Synthetic data solves for concerns around personally identifiable information

First and foremost, synthetic data is a potentially viable solution to the common privacy challenges of sharing data across partners, including for the purpose of targeted digital advertising campaigns.

While the technology to anonymize data sets via synthetic data is still relatively immature, synthetic data could, in theory, protect customers’ personally identifiable information (PII), such as Social Security numbers, phone numbers and email addresses, as well as sensitive data like race and gender.

Today, the process behind the anonymization and sharing of real first-party data can be time-intensive and expensive. A synthetic data set, on the other hand, can be generated from the original “real” data set so that it no longer contains any of the original PII or sensitive information.

For instance, a large U.S. insurance company used synthetic data to anonymize complex transactional data sets that contain PII by extracting the data sets’ statistical information and complex relationships. The new synthetic data set no longer contained any original information and could not be traced back to individuals, ensuring regulatory compliance while preserving the statistical attributes and trends of consumer behavior. The company could then securely share the data with third parties for behavioral analysis of account transactions in three days rather than six months. In advertising, this could help to create segments for specific targeted ads or build unique customer journeys.
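The idea of extracting a data set’s statistical information and then generating new records from it can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the records, the field names and the choice of a simple two-variable Gaussian model are hypothetical, and this is not the insurer’s or any vendor’s actual method. Production tools use far richer models, and a sketch like this does not by itself guarantee privacy.

```python
import math
import random
import statistics

# Hypothetical "real" customer records; names and values are invented.
REAL_RECORDS = [
    {"email": "a@example.com", "monthly_spend": 120.0, "transactions": 4},
    {"email": "b@example.com", "monthly_spend": 80.0, "transactions": 3},
    {"email": "c@example.com", "monthly_spend": 200.0, "transactions": 7},
    {"email": "d@example.com", "monthly_spend": 150.0, "transactions": 5},
    {"email": "e@example.com", "monthly_spend": 60.0, "transactions": 2},
    {"email": "f@example.com", "monthly_spend": 240.0, "transactions": 9},
    {"email": "g@example.com", "monthly_spend": 100.0, "transactions": 3},
    {"email": "h@example.com", "monthly_spend": 180.0, "transactions": 6},
]


def fit_summary(records, x_key, y_key):
    """Extract only aggregate statistics: means, spreads and correlation."""
    xs = [r[x_key] for r in records]
    ys = [r[y_key] for r in records]
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    std_x, std_y = statistics.stdev(xs), statistics.stdev(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return {
        "x_key": x_key, "y_key": y_key,
        "mean_x": mean_x, "mean_y": mean_y,
        "std_x": std_x, "std_y": std_y,
        "rho": cov / (std_x * std_y),
    }


def sample_synthetic(summary, n, seed=0):
    """Draw n new records from a correlated Gaussian fitted to the summary.

    No original row (and no PII column such as email) is copied over.
    Values are continuous, so counts like transactions come out as floats.
    """
    rng = random.Random(seed)
    rho = summary["rho"]
    residual = math.sqrt(max(0.0, 1.0 - rho ** 2))
    rows = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = summary["mean_x"] + summary["std_x"] * z1
        y = summary["mean_y"] + summary["std_y"] * (rho * z1 + residual * z2)
        rows.append({summary["x_key"]: x, summary["y_key"]: y})
    return rows


summary = fit_summary(REAL_RECORDS, "monthly_spend", "transactions")
synthetic = sample_synthetic(summary, n=5000, seed=1)
```

The synthetic rows carry the same averages and the same spend-to-transactions relationship as the originals, which is what makes them usable for segmentation, while the email column never leaves the source system.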

Similar examples will only continue to emerge to protect consumer privacy: By 2025, Gartner expects synthetic data to reduce personal customer data collection in a way that avoids 70% of privacy violation sanctions.

Synthetic data helps marketers advance ad creatives

The use of AI and machine learning (ML) is on the rise in marketing, and one such technology supported by synthetic data is the deepfake, a type of synthetic media that replaces existing videos or audio with synthetically generated images or audio.

While deepfake technology has been criticized in instances of nonconsensual use, it has already been implemented in successful advertising campaigns.

Take one example from a large beauty company. It aimed to raise awareness about the harmful nature of beauty advice online by using deepfake face-mapping technology to create digital stand-ins of mothers promoting harmful beauty advice (ironically) within social media content. The result was shocked reactions from the mothers seeing their digital selves deliver such toxic advice, in turn inspiring them to have more honest and informed conversations with their children about what they’re seeing on their social media feeds. The application of deepfake technology allowed the beauty brand to deliver a genuine campaign message in a way that reduced production efforts and saved costs.

Deepfake-generated creative will become a more frequent fixture of advertising campaigns as marketers strive to keep pace with emerging tech development. In fact, by 2025, Gartner expects 30% of outbound marketing messages from large organizations will be synthetically generated, up from less than 2% in 2022.

Synthetic data enhances product testing and development

Gartner predicts that image- and video-related synthetic data will constitute more than 95% of data used for AI models by 2030. One of the most prevalent use cases for synthetic data is training ML models to develop products and features that can raise business value by improving product quality, reducing costs and potentially uncovering new products or services in the process.

In one example, an autonomous vehicle company inserts synthetically generated scenarios into its training videos that are otherwise impractical or unsafe to gather in real life, like a pedestrian walking in front of a moving vehicle. The data from these tests informs how the AI reacts, and the AI can then be optimized for any number of safety features without needing to learn from real-world scenarios or put real people in danger. Gartner predicts that synthetic data will reduce the volume of real data needed for machine learning by 70% within three years.

The future is here

Consider the following scenario: It’s 2030, and marketers for “Brand X” are preparing their campaigns for Super Bowl 64. The game, and its record-breaking commercials and sponsorships, will be broadcast live in nearly 100 million digitally connected and addressable homes.

Brand X believes in its product and is excited to debut its brand-new commercial, where, through the use of synthetic data based on its existing customer base and generative AI for video, it was able to craft thousands of versions in a fraction of the time of a traditional commercial shoot. The target homes have been segmented into one or more of hundreds of delivery groups, each receiving their own combination of video, sound and graphics designed to connect to each viewer on a key attribute of its products. Viewers love it, but the campaign throws a wrench in the typical best lists that come out the Monday after the game.

So how can marketers today leverage these practical implications and prepare for the role that synthetic data will eventually play in their strategies? Marketing leaders can start with the following imperatives:

  • Identify whether synthetic data can help the organization remain compliant with regulations on privacy and identity, as well as emerging regulations on the applications of AI in business.
  • Verify vendor claims and validate use cases. Choose vendors that generate synthetic data sets that preserve statistical attributes and behavioral trends while removing PII.
  • Assess if deepfakes can help increase the impact of ad campaigns and meet marketing goals. Consider cost and delivery timelines against traditional ad campaigns and budgets.
  • Avoid “shiny object syndrome” by partnering with leaders in IT, R&D and product teams to assess the business and customer value of synthetically generated or trained products.

Accessing the real-world data needed to train models or gain consumer insights is getting harder and harder, in part due to privacy and security concerns. But with synthetic data, marketers can circumvent collection limitations or regulatory restrictions around personal information, making data both available and applicable to various marketing and product applications.

This story is part of Adweek’s Advertising Redefined digital package, which spotlights all the ways that the industry is evolving as brands face greater challenges than ever in reaching consumers.