Explainer: Synthetic Data Privacy 

A new form of technology has hit the data governance market: synthetic data generation. 

Adweek states that synthetic data is used “to retain the statistical and behavioral aspects of real data sets without compromising the privacy of those individuals from which the data was collected.” Likewise, this data can replicate “real-world data” that “would otherwise be impractical because of collection limitations or regulatory restrictions.” 

In simpler terms, synthetic data generation analyzes real datasets, builds statistical models of them, and uses those models to produce artificial records that can guide decision making, all without humans seeing the actual data.
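The following is a minimal sketch of that idea in Python, not a description of how any particular vendor works. It assumes a hypothetical table with two numeric columns (age and monthly spend), summarizes it as a bivariate normal distribution, and samples new rows from that summary. The column meanings, the sample values, and the function names are all illustrative assumptions, and a toy model like this does not by itself guarantee privacy.

```python
import math
import random
import statistics

def fit_bivariate_normal(rows):
    """Summarize (x, y) rows as means, standard deviations, and correlation."""
    xs = [r[0] for r in rows]
    ys = [r[1] for r in rows]
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sd_x, sd_y = statistics.stdev(xs), statistics.stdev(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in rows) / (len(rows) - 1)
    return {
        "mean_x": mean_x, "mean_y": mean_y,
        "sd_x": sd_x, "sd_y": sd_y,
        "rho": cov / (sd_x * sd_y),
    }

def sample_synthetic(model, n, seed=0):
    """Draw n artificial (x, y) rows from the fitted model, not from the real rows."""
    rng = random.Random(seed)
    rho = model["rho"]
    # Clamp guards against tiny negative values caused by floating-point error.
    resid = math.sqrt(max(0.0, 1.0 - rho * rho))
    rows = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = model["mean_x"] + model["sd_x"] * z1
        y = model["mean_y"] + model["sd_y"] * (rho * z1 + resid * z2)
        rows.append((x, y))
    return rows

# Hypothetical "real" data: (age, monthly spend) for eight customers.
real_rows = [
    (23, 40.0), (31, 55.5), (35, 61.0), (42, 80.0),
    (47, 78.5), (52, 95.0), (58, 99.0), (64, 120.0),
]

model = fit_bivariate_normal(real_rows)
synthetic_rows = sample_synthetic(model, n=2000)
```

The synthetic rows follow roughly the same averages and the same age-to-spend relationship as the original table, which is what allows them to stand in for the real records in analysis.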

Benefits of Synthetic Data Generation: 

Synthetic data is useful to organizations for a variety of reasons. 

Personally Identifiable Information

Synthetic data addresses concerns surrounding personally identifiable information (PII) because it can be shared across business partners without the risk that an individual’s actual PII will be exposed. The synthetic data is based on “real data” obtained from individuals, but it is scrubbed of any original PII or sensitive data.
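As a rough illustration of what “scrubbed” can mean in practice, the sketch below removes direct identifiers before any modeling happens and then checks whether a synthetic record happens to reproduce a real one exactly. The field names, the sample records, and both helper functions are hypothetical, and an exact-match check is only one simple safeguard among the many a real program would need.

```python
# Hypothetical field names; a real schema will differ.
DIRECT_IDENTIFIERS = {"name", "email", "phone"}

def strip_identifiers(records, identifiers=DIRECT_IDENTIFIERS):
    """Return copies of the records without direct-identifier fields."""
    return [
        {key: value for key, value in rec.items() if key not in identifiers}
        for rec in records
    ]

def exact_matches(real_records, synthetic_records):
    """Return the synthetic records that reproduce a real record exactly."""
    real_set = {tuple(sorted(rec.items())) for rec in real_records}
    return [
        rec for rec in synthetic_records
        if tuple(sorted(rec.items())) in real_set
    ]

# Illustrative "real" records containing PII.
real_records = [
    {"name": "Ana Ruiz", "email": "ana@example.com", "age": 34, "plan": "basic"},
    {"name": "Ben Cho", "email": "ben@example.com", "age": 51, "plan": "premium"},
]
modeling_input = strip_identifiers(real_records)

# Illustrative generator output; the second row copies a real customer's attributes.
synthetic_records = [
    {"age": 37, "plan": "basic"},
    {"age": 51, "plan": "premium"},
]
leaks = exact_matches(modeling_input, synthetic_records)
```

Here `leaks` contains the second synthetic record, which would be dropped or regenerated before the data set is shared with a business partner.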

Gartner, the management consulting company that produced the Adweek article, predicted that the use of synthetic data would “reduce personal customer data collection in a way that avoids 70 percent of privacy violation sanctions.” 

Marketing and Advertising 

When AI and machine learning (ML) are paired with synthetic data generation techniques, the result is often described as “deepfake technology,” which is “a type of synthetic media that replaces existing videos or audio with synthetically generated images or audio.” 

The use of deepfake technology has received some criticism, according to Adweek, but it has already been successfully implemented in advertising and marketing campaigns, and Adweek anticipates that it “will become a more frequent fixture of advertising campaigns as marketers strive to keep pace with emerging tech development.” In fact, Gartner expects that by 2025, “30% of outbound marketing messages from large organizations will be synthetically generated, up from less than 2% in 2022.” 

Product Testing and Development 

Image- and video-related synthetically generated data is expected to “constitute more than 95% of data used for AI models by 2030.” One of the prevailing examples of this usage is “training ML models to develop products and features that can raise business value by improving product quality, reducing costs and potentially uncovering new products or services in the process.” 

Growth in the Market: 

Adweek anticipates “a wider embrace” of synthetic data generation “within the next two to five years,” and predicts that it will become the “norm” in the marketing industry. To read more about the projected growth of this technique, review The Synthetic Data Generation Market Research Report published by the Digital Journal. The report analyzes how synthetic data generation fits into the existing market landscape and projects the expected growth prospects before 2030, much of which can be attributed to the COVID-19 pandemic. 

If your organization is interested in adopting this technique, you can begin by analyzing the data in your information systems and how your business model uses it to determine whether synthesizing that data would be useful. 

Once you have determined that this technique is right for you, review synthetic data generation vendors and select those whose generated data sets match your organization’s objectives and existing “real life” data sets. 

As Digital Journal points out, the synthetic data generation market “is intensely competitive and fragmented because of the presence of several established players participating in various marketing strategies to expand their market share.” In this market, the competition among vendors is “centered on price, quality, brand, product differentiation, and product portfolio.” Two emerging options to consider are Gretel and Genalog.
