As a market research software provider, we often share our own perspective on the impact of synthetic data and enjoy guiding you through the evolving landscape of market research. This time, however, we want to shift focus and explore how brands themselves are deploying and experimenting with synthetic data and digital twins. Generative AI gives brands a potent tool: the ability to innovate flexibly in personalization, forecasting, and product development while maintaining data security.

Quick refresh of synthetic data

We know we might sound like a broken record defining synthetic data, but let’s quickly revisit it. Unlike “real data,” synthetic data is generated using algorithms or mathematical models, often on the basis of real-world samples or information. Because these models learn the patterns and correlations in the source, the artificial data replicates the essential characteristics of real data. Used across industries, it supports processes from fraud detection in banking and policy development in healthcare to campaign and product testing with synthetic respondents in market research.
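To make "analyzing patterns and correlations" concrete, here is a minimal sketch, not any vendor's actual method. It assumes a toy dataset of (age, monthly spend) pairs that we made up for illustration, fits a simple two-variable normal model to it, and then draws artificial records from that model. The function names `fit_bivariate` and `sample_synthetic` are hypothetical.

```python
import math
import random
import statistics


def fit_bivariate(rows):
    """Estimate means, standard deviations, and correlation from (x, y) pairs."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sx, sy = statistics.stdev(xs), statistics.stdev(ys)
    cov = sum((x - mx) * (y - my) for x, y in rows) / (len(rows) - 1)
    return mx, my, sx, sy, cov / (sx * sy)


def sample_synthetic(params, n, seed=0):
    """Draw n artificial (x, y) records that follow the fitted pattern."""
    mx, my, sx, sy, r = params
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = mx + sx * z1
        # Mix z1 into y so that x and y keep the estimated correlation r.
        y = my + sy * (r * z1 + math.sqrt(max(0.0, 1 - r * r)) * z2)
        records.append((x, y))
    return records


# Made-up "real" sample for illustration only: (age, monthly spend).
real = [(22, 35.0), (27, 48.0), (31, 52.0), (38, 70.0),
        (45, 66.0), (52, 90.0), (60, 85.0)]

params = fit_bivariate(real)
synthetic = sample_synthetic(params, n=5000)
synthetic_params = fit_bivariate(synthetic)

print("real correlation:     ", round(params[4], 3))
print("synthetic correlation:", round(synthetic_params[4], 3))
```

None of the synthetic records belongs to a real person, yet the averages and the age-to-spend relationship stay close to the original sample. Production tools use far richer models than this, but the principle is the same.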

Why does synthetic data matter for brands?

The surge in AI tool integration is undeniably making waves. Capgemini Research Institute’s latest report notes that organizations are investing more than 60% of their marketing technology budget in generative AI, reflecting the prevailing rush to keep pace with innovation. As professionals increasingly turn to synthetic data, they’re discovering powerful ways to bolster internal operations and customer-facing strategies. For brands, this shift not only streamlines processes but also supports smarter decision-making, sharper insights, and more personalized customer experiences. They now have a fast and cost-effective way to experiment within their market and gain deeper insights that support their goals.

How brands are embracing digital twins

As the name suggests, digital twins are, put simply, digital doppelgängers. Whether they represent a person, an object, or a system, these virtual replicas deepen understanding of consumer behavior as if it were being observed in the real world. Often built on real data sources, and at times updated in real time to capture the asset’s dynamic nature, digital twins create an immersive environment for organizations to explore and experiment. By connecting scenario planning with internal functions, businesses can assess decisions within a digital environment.

According to McKinsey’s analysis, nearly 75% of companies have integrated digital twin technologies into complex aspects of their operations. On top of that, the global market for this technology is projected to grow at an annual rate of about 60% over the next five years, reaching $73.5 billion by 2027. AI is set to reshape marketing and brand journeys.

Streamlining manufacturing and product development

The use of digital twins creates risk-free environments for product development, enabling teams to explore design options, analyze product behavior, and monitor interactions. This approach not only streamlines the design process but also facilitates the identification and evaluation of potential changes, ultimately driving innovation and efficiency in production. Projects that were hindered by lengthy privacy compliance processes are now easily deployed with these replicas.

The collaboration between Siemens and TrakRap, a packaging solutions manufacturer, shows this in practice: digital twin technology proved invaluable in developing and optimizing a packing machine prototype. By simulating finite elements, materials, and controls all within a digital environment, they were able to design the process to its full potential, evaluate suitable configurations, and significantly reduce costs. This approach enabled them to achieve sustainable advancements in the industry.

Digital twin replicates real-world systems

Building go-to-market plans

Evidenza.ai is an AI-driven platform that creates a diverse array of synthetic customer personas, each based on unique personal and professional attributes tailored to product categories. These persona profiles enable researchers to effectively target and analyze distinct groups, augmenting brand positioning, go-to-market strategies, and competitive insights. By using synthetic research, the platform offers valuable, data-driven intelligence that resonates with C-suite executives, providing them with actionable insights for decision-making.

Market strategies shaped through insights from synthetic ‘ideal customer’ profiles

Building immersive customer experiences

Metaverse-style digital twins replicate physical environments within virtual spaces for immersive experiences and operations. Using lifelike simulations, brands can interact with these systems for experiential purposes such as user engagement, training, and real-time insights. To see this in action, consider the digital twin implementation in smart stadiums. SoFi Stadium, located in the United States, has been developed with such technology to amplify one of the world’s favorite pastimes: sports and entertainment. Designed with “modernizing the fan experience in mind,” SoFi incorporates data-driven features that record real-time insights to optimize operations, offer structural support, and sync performance, all to identify opportunities to improve the experience for fans.

SoFi Stadium is replicated within a virtual realm for experience and operational strategies

Predicting future trends with synthetic data

Alcohol giant Diageo is one example, using LLM algorithms to predict the future of its supply chain. Teaming up with Ai Palette, Diageo tracked emerging trends in its beverages by “scanning everything from social media to news article mentions and restaurant and bar menus, to determine which flavours are growing in popularity on either a national, regional or global level.” The comprehensive data generated by these models established a new product development framework for the brand, resulting in more informed, data-driven product launches.

AI is harnessed to inform product launches

Generative AI for privacy preservation

At a time when data breaches and privacy concerns are rife, protecting data is critical. As with any online tool, there is a significant risk of sensitive data being leaked or exposed to malicious actors. However, effective solutions exist to mitigate these risks, and implementing them is essential.

On the one hand, synthetic data offers a platform and artificial data points to test “ad campaigns or conduct product testing on synthetic populations to reduce risk and refine go-to-market strategies” without privacy concerns. On the other hand, the LLMs behind these operations often rely on third-party APIs, retrieval-augmented generation (RAG), few-shot prompting, and fine-tuning, any of which may leave data exposed. To address these privacy challenges, solutions such as Tonic Textual by Tonic.ai are brought into play.

Tonic Textual is a “text de-identification tool that uses state of the art proprietary named-entity recognition (NER) models to identify personally identifiable information (PII) in text.” In other words, it analyzes text data to identify information that is unique to individuals or deemed private, removing key identifying features that could lead to data leakage. Tonic Textual is designed to clean a brand’s data in two ways:

  • Redaction: Personally identifiable information is directly extracted and replaced with placeholder data.
  • Synthesis: Synthetic, non-sensitive data is inserted in place of personally identifiable information to retain privacy.
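The difference between the two modes is easier to see side by side. The sketch below is an illustration only and does not use Tonic Textual or its API. As a stand-in for a trained NER model, it assumes two simple regular expressions (emails and phone numbers), and the helper names `redact` and `synthesize` are hypothetical.

```python
import random
import re

# Simplified patterns standing in for a real NER model (an assumption for
# this sketch). They catch emails and phone numbers, but not names.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact(text):
    """Redaction: swap each detected identifier for a fixed placeholder."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


def synthesize(text, seed=0):
    """Synthesis: swap each detected identifier for a realistic fake value."""
    rng = random.Random(seed)

    def fake_email(_match):
        return f"user{rng.randint(1000, 9999)}@example.com"

    def fake_phone(_match):
        # 555-0100 through 555-0199 are reserved for fictional use.
        return f"555-01{rng.randint(0, 99):02d}"

    text = EMAIL.sub(fake_email, text)
    return PHONE.sub(fake_phone, text)


note = "Contact Ana at ana.silva@brand.co or +1 415 555 2671 about the survey."

print(redact(note))
print(synthesize(note))
```

Redacted text is safe but loses its natural shape, while synthesized text still reads like real data, which matters when it feeds a downstream model. Note that the name "Ana" survives in both outputs here. Catching names is the kind of case where an NER model is needed rather than pattern matching.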

Incorporating data security applications into preprocessing pipelines is essential for brands to fully leverage the benefits of generative AI in a compliant way. Prioritizing this solution not only safeguards individual privacy but also keeps an organization compliant with changing regulations, paving the way for a safer digital landscape. Plus, it once again uses the power of synthetic data to fill in where real-world methods fall short!

Personally identifiable information is redacted and replaced with synthetic data

AI-augmented data

At Fairgen, while we support generative AI in forms similar to digital systems, we take a distinct route when it comes to synthetic data. Augmented synthetic research is a powerful way to analyze niche markets. Fairgen boosts survey reliability by generating synthetic data that accurately reflects under-sampled groups. This allows researchers to capture insights on niche customer segments that may be challenging to reach through traditional methods, enabling more precise, data-driven strategies tailored to these unique audiences. Fairgen’s tools augment representation, ensuring that smaller or specialized market segments can be thoroughly understood. AI-driven data modeling marks a breakthrough in redefining market research, and this augmented approach produces predictive insights from highly relevant data models with unprecedented depth and flexibility.

Hybrid data models

Researchers, brands, and organizations at large are rapidly embracing systems powered by synthetic data and generative AI models. Digital twins, as dynamic platforms for consumer and product research, offer immense potential for analysis and simulation. However, it is important to recognize the inherent risks, including unintended biases, lack of realism, and accuracy issues within AI systems.

As such, we advocate for a blended approach, where “data is used to augment, rather than replace human-based data gathered from real-world observations.” To explore this balanced perspective further, read our blog post, Transforming Market Research Operations Through Human and Machine Synergy.

Top Synthetic Data Providers to Watch

  • Fairgen: Tapping into quantitative studies, Fairgen offers unprecedented granularity with augmented synthetic respondents. Deep machine learning models are trained on uploaded datasets, boosting niches of interest with predictive respondents, ultimately tripling the segment sample size.
  • Yabble: Powered by proprietary, augmented LLMs, Yabble’s solutions offer a comprehensive suite of AI tools that streamline data analysis. Their innovative "Virtual Audiences" feature enables instant insights, eliminating the need for traditional fieldwork. This approach quickly delivers actionable market trends, brand perceptions, concept testing, and audience segmentation, making it ideal for brands seeking rapid, in-depth consumer insights.
  • OpinioAI: Utilizing LLMs, OpinioAI offers the capacity to conduct in-depth one-on-one interviews or large-scale synthetic research, helping brands generate insights from existing data like surveys, transcripts, and reports. Their platform enables users to create synthetic personas that reflect target market segments, analyze core brand positioning, and assess value propositions.
  • Native AI: Native AI is a consumer analytics platform powered by AI and natural language processing (NLP) to analyze and predict customer preferences and behaviors. Through generative AI, it offers actionable recommendations by examining both qualitative and quantitative data, presenting insights in dashboards with visualizations, automated product tracking, and in-depth reporting. Their proprietary "Digital Twins" feature creates interactive, AI-driven personas that emulate target customers, enabling users to gather predictive insights on product preferences and competitive trends.
  • Synthetic Users: Synthetic Users is an AI-driven platform, powered by large language models, designed to test products and concepts with lifelike, AI-generated personas. Its conversational interface allows companies to conduct dynamic surveys and simulate realistic scenarios to efficiently validate market behaviors, assess product-market fit, and gather actionable feedback from diverse synthetic user profiles.
  • Zappi: Fusing the power of consumer data and AI, Zappi’s software delivers insights that allow users to explore and test systems to understand consumer feedback. Additionally, they help streamline product development processes by incorporating AI-driven data into operations that screen, refine, and vet concepts.
  • Evidenza: Specializing in synthetic data for real-time qualitative and quantitative research, Evidenza offers virtual personas specifically for the B2B sector. It employs AI to create synthetic audiences for on-demand surveys and interviews, and users can choose whether research design, data generation, analysis, and reporting are fully managed by the solution or handled through a DIY SaaS model for a more hands-on approach.
  • Roundtable: Roundtable is an AI-powered survey analysis platform that uses synthetic data to simulate audience responses. Its Alias API enables targeted segmentation and analysis, while automated bot detection tools streamline data cleaning and fraud prevention.