The Promise of Synthetic Data: A Breakthrough in Market Research Data Collection

Published on October 6, 2024

Written by Noa Kalmanovich

Our collaboration with the Ifop Group was presented at ESOMAR’s annual Congress in Athens, Greece, in the white paper “Synthetic Data in Marketing Studies: Exploring the promise of generative AI and synthetic data.” To address a key industry challenge in data collection, we worked with Thomas Duhard, Head of Data Projects at Ifop, to push the boundaries of AI and understand what synthetic samples can contribute to our industry’s search for insights.

Below is a brief overview of the paper. You can also download the full paper and watch our presentation from the ESOMAR Congress earlier this month.

What are we looking to achieve with synthetic samples?

Standard data collection practices often struggle to balance fundamental economic and technical factors, such as assuring representativeness, achieving sufficient sample sizes, and maintaining data quality. Augmented respondents address this problem by narrowing the scope and boosting real data with AI-generated synthetic samples.

Figure: Promises of synthetic data in market research

Understanding boost factors with the most extensive industry benchmarks

In the paper, the authors demonstrate the effectiveness of synthetic sample boosters through more than 7,000 parallel tests on datasets from the Pew Research Center. The tests compare real boosts with AI-generated boosts and illustrate how synthetic boosters can improve samples of low-incidence populations that are often hard to analyze.

The paper then explains the methodology behind the calculation of Effective Sample Sizes (ESS) and boost factors, concluding that, on average, a sample boosted by Fairgen is as reliable as three times the amount of real data at the sub-segment level.
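To make the idea of a boost factor concrete, here is a minimal Python sketch. It is not the paper's methodology, which this overview does not spell out. It assumes one common convention: the ESS of a boosted estimate is the size of a simple random sample whose sampling variance, p(1 - p) / n, would equal the estimate's observed mean squared error, and the boost factor is that ESS divided by the number of real respondents. The function names and the numbers in the example are hypothetical.

```python
def effective_sample_size(p_true: float, mse: float) -> float:
    """Size of a simple random sample whose variance p(1-p)/n equals `mse`.

    Assumed definition for illustration only; the paper may define ESS differently.
    """
    if not 0.0 < p_true < 1.0:
        raise ValueError("p_true must be strictly between 0 and 1")
    if mse <= 0.0:
        raise ValueError("mse must be positive")
    return p_true * (1.0 - p_true) / mse


def boost_factor(ess: float, n_real: int) -> float:
    """Ratio of the effective sample size to the number of real respondents."""
    if n_real <= 0:
        raise ValueError("n_real must be positive")
    return ess / n_real


# Hypothetical example: a sub-segment with 100 real respondents whose boosted
# estimate shows the error one would expect from 300 real respondents.
n_real = 100
observed_mse = 0.5 * 0.5 / 300  # made-up value chosen for illustration
ess = effective_sample_size(p_true=0.5, mse=observed_mse)
print(f"ESS: {ess:.0f}, boost factor: {boost_factor(ess, n_real):.1f}")
```

Under this convention, a boost factor of three means the boosted sub-segment behaves, in terms of estimation error, like a real sample three times as large.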

Figure: Statistical validation behind synthetic sample boosts

Qualitative benchmarking while boosting swing groups in the European election of 2024

The paper then showcases a study of the 2024 European elections, in which Ifop augmented a key swing group, secondary school teachers, using our synthetic boosts. The political poll included a representative sample of 8,000 French adults, with only 116 respondents from the teacher demographic. Using augmented synthetic respondents, this group was boosted to 580 respondents, correcting inconsistencies and aligning the sample with sociological plausibility, and ultimately providing a better read on this rare demographic’s influence on the election’s outcome. The results also showed that AI can reliably mimic human responses, enhancing the representation of niche groups.
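For a rough sense of scale, the sketch below works through the arithmetic for this sub-group in Python. It rests on several assumptions that do not come from the paper: the standard 95% margin-of-error formula for a proportion under simple random sampling, a worst-case proportion of 0.5, and the three-times average boost factor applied to this particular study. The number of synthetic rows (580) is not the same thing as the effective sample size, so the second figure is illustrative only.

```python
import math


def margin_of_error(n: float, p: float = 0.5, z: float = 1.96) -> float:
    """Half-width of a normal-approximation confidence interval for a proportion."""
    if n <= 0:
        raise ValueError("n must be positive")
    return z * math.sqrt(p * (1.0 - p) / n)


n_teachers = 116        # real respondents in the sub-group (from the article)
n_boosted = 580         # respondents after boosting (from the article)
assumed_factor = 3.0    # assumption: the average boost factor applies here

size_multiplier = n_boosted / n_teachers        # how much larger the boosted group is
assumed_ess = assumed_factor * n_teachers       # hypothetical effective sample size

print(f"Boosted group is {size_multiplier:.0f}x the real group")
print(f"Margin of error, real sample only: {margin_of_error(n_teachers):.3f}")
print(f"Margin of error at assumed ESS:    {margin_of_error(assumed_ess):.3f}")
```

Under these assumptions the margin of error falls from roughly nine percentage points to roughly five, which is the kind of gain that makes a rare sub-group readable.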

What’s next for synthetic samples?

While the industry benefits from the economic gains and flexibility offered by augmented synthetic respondents, the paper also raises several key questions about synthetic data:

What are the limitations of synthetic data?
Does synthetic data pose a reliability risk?
Is synthetic data the latest breakthrough in data collection?

Samuel and Thomas address these challenges and propose responsible deployment strategies to set a standard for ethical and effective use of augmented synthetic samples.

In conclusion, while augmenting real data promises significant benefits by delivering insights at a finer level of granularity, it is essential to operate within the technology’s limitations. Careful deployment is vital for maintaining data quality and preventing misuse.

Through this collaboration, Fairgen and Ifop demonstrate that synthetic data is a powerful and viable tool for modern quantitative research. When its limitations are acknowledged and its potential is put to full use, synthetic data can drive granular recommendations and propel the industry forward.

Access the full paper and watch our talk from the ESOMAR Congress event here.

