A Conversation with Fairgen's Scientific Advisor, Emmanuel Candès

Published on April 21, 2024
Written by Fernando Zatz

Fairgen is honored to have Professor Emmanuel Candès, a distinguished figure in the fields of mathematics and statistics, as a Scientific Lead Advisor.

Today, we are excited to share our recent interview with him, where we delve into his impressive career, his role as a strategic technical advisor for Fairgen, and his insights on the AI revolution, fairness, and governance.

Q: Can you tell us about your background and how you became interested in mathematics and statistics?

A: I became interested in applied mathematics and statistics during my undergraduate studies in France, where I mostly studied mathematics and physics. Afterwards, with a keen interest in statistics, I decided to try to contribute something original to the sciences. I also wanted to see how people approach science outside of France. I therefore joined the Stanford Ph.D. program in statistics and spent four years learning new things that have shaped my career for the past 30 years.

Q: What does your current research focus on?

A: My current research has two main focuses, both related to the big data era and black-box models. The first is addressing the reproducibility crisis in science, where I develop statistical tools to avoid non-replicable discoveries, especially in genetics. The second is uncertainty quantification, particularly in sensitive applications like criminal justice and finance, where I work on returning prediction ranges with prescribed validity to provide decision-makers with honest and calibrated information.
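As an illustration that is not part of the interview, one well-known way to obtain "prediction ranges with prescribed validity" around a black-box model is split conformal prediction. The sketch below is a minimal version under stated assumptions: the `predict` function, the simulated data, and the 90% target are all hypothetical and chosen only to make the example runnable.

```python
import math
import random


def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score.

    If that rank exceeds n, the only valid interval is the whole real line.
    """
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return sorted(scores)[k - 1]


def predict(x):
    # Hypothetical black-box model; any fitted predictor could stand in here.
    return 2.0 * x


rng = random.Random(0)


def sample(n):
    # Simulated data (an assumption): y = 2x + standard Gaussian noise.
    xs = [rng.uniform(0, 10) for _ in range(n)]
    return [(x, 2.0 * x + rng.gauss(0, 1)) for x in xs]


calibration = sample(500)   # held-out data, not used to fit the model
test = sample(2000)
alpha = 0.1                 # target miscoverage: intervals should cover ~90%

# Conformity scores are absolute residuals on the calibration set.
scores = [abs(y - predict(x)) for x, y in calibration]
q = conformal_quantile(scores, alpha)

# Prediction interval for a new x is [predict(x) - q, predict(x) + q].
covered = sum(predict(x) - q <= y <= predict(x) + q for x, y in test)
coverage = covered / len(test)
print(f"half-width = {q:.2f}, empirical coverage = {coverage:.3f}")
```

The guarantee in this construction is marginal (averaged over calibration and test draws) and relies on the calibration and test points being exchangeable; it does not depend on the model being accurate.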

Q: What drew you to your role as a strategic technical advisor for Fairgen, and what do you hope to achieve with Fairgen?

A: Several factors drew me to this role. First, Fairgen is involved in a fascinating area where we use generative AI to augment real data, and I'm intrigued by the challenge of ensuring rigorous validation and avoiding biases in this process. Second, I had positive interactions with Samuel, CEO and Founder of Fairgen, and felt that we could work well together. Lastly, a friend of mine connected me with Fairgen, and it seemed like a promising opportunity both scientifically and from a business perspective.

Q: How does your collaboration with Fairgen take place?

A: My collaboration with Fairgen primarily involves weekly one-hour meetings with their engineers. We exchange technical documents before these meetings, which allows for productive discussions. We address technical challenges, roadblocks, and strategy. The pace of work in a startup like Fairgen is fast, and we focus on delivering value to clients, which often involves solving novel technical problems without a predefined framework.

Q: What are the main changes that have occurred in the field of machine learning and AI, particularly in relation to the surge in big data, and what are the main challenges associated with handling large datasets and deep machine learning models?

A: The main changes in machine learning and AI have been driven by the explosion in the size of available datasets and the increase in computing power. This transformation has enabled significant advancements in the field. As Peter Norvig stated, it's not just improved algorithms but the massive increase in data that has been the key driver of progress. We now have the ability to collect and analyze vast amounts of digital data, transforming how we approach science, business, and society as a whole. When these algorithms are applied to people, it is crucial to make sure they are equitable. Designing rigorous audits to assess whether machine learning models apply equitably across different groups is an essential challenge.

Q: How should the industry ensure the responsible use of AI, particularly concerning fairness and ethics?

A: Responsibility in AI involves several key considerations. First, there is no one-size-fits-all definition of fairness, so I find it important to avoid mathematically encoding a specific notion of fairness into algorithms. If people disagree on what fair treatment is, I don’t see how we can quantify the notion of fairness. Instead, algorithms should go through independent, rigorous evaluation processes based on metrics relevant to decision-makers. Second, it's crucial not to conflate policy-making with risk assessment. Machine learning algorithms can assess the level of risk, but this fact does not define a policy. Imagine we actually solved the machine learning problem in the sense that we could make perfect predictions; then we’d still have a policy problem. Therefore, we should not position prediction algorithms as decision makers. Rather, algorithms should simply inform the formulation of policy. Third, auditing algorithms for fairness and ensuring that the audit process is well-constructed with clear guarantees can help identify and address issues. For instance, we would not want algorithms that give biased predictions for women but not for men. Ultimately, the responsible use of AI requires an ongoing commitment to transparency, fairness, and ethical considerations.
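To make the audit idea concrete, here is a minimal sketch (not from the interview) of the first step of such a check: measuring whether a model's predictions are systematically off for one group but not another. The `bias_by_group` helper and the four records are hypothetical, and the metric shown (mean signed error) is only one of many a decision-maker might choose.

```python
from collections import defaultdict


def bias_by_group(records):
    """Return the mean signed error (prediction minus truth) per group.

    `records` is an iterable of (group, y_true, y_pred) tuples.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for group, y_true, y_pred in records:
        totals[group] += y_pred - y_true
        counts[group] += 1
    return {group: totals[group] / counts[group] for group in totals}


# Made-up audit data, for illustration only.
records = [
    ("women", 10.0, 12.0),
    ("women", 20.0, 23.0),
    ("men", 10.0, 10.5),
    ("men", 20.0, 19.5),
]

report = bias_by_group(records)
print(report)  # {'women': 2.5, 'men': 0.0}
```

A point estimate like this is not yet an audit "with clear guarantees": a rigorous version would attach confidence statements to each group's estimate and account for the number of groups being compared.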

Q: Can you please explain the concept of control theory and its application in monitoring and correcting machine learning algorithms?

A: Control theory involves monitoring and adjusting processes to maintain desired outcomes. This framework can be applied to machine learning algorithms. The idea is that we should constantly monitor model performance and make corrections (e.g., update the model parameters or refit the model) when there is too much drift. For example, in predicting stock market volatility, when conditions change (e.g., a sudden market crash), the algorithm should adapt, and adaptivity can indeed be achieved by feedback mechanisms much like in control theory.
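The monitoring half of that loop can be sketched in a few lines. This is an illustrative example rather than anything described in the interview: the `DriftMonitor` class, the window size, the threshold, and the data stream are all assumptions. It tracks a rolling mean of absolute errors and raises an alarm when that mean crosses a threshold, which is the signal a caller would use to refit or update the model.

```python
from collections import deque


class DriftMonitor:
    """Flag drift when the rolling mean absolute error exceeds a threshold."""

    def __init__(self, window, threshold):
        self.errors = deque(maxlen=window)  # keeps only the latest `window` errors
        self.threshold = threshold

    def update(self, y_true, y_pred):
        """Record one error; return True if the full window signals drift."""
        self.errors.append(abs(y_true - y_pred))
        window_full = len(self.errors) == self.errors.maxlen
        return window_full and sum(self.errors) / len(self.errors) > self.threshold


monitor = DriftMonitor(window=3, threshold=1.0)

# Hypothetical (truth, prediction) stream: accurate at first, then a regime change.
stream = [(1.0, 1.1), (1.0, 0.9), (1.0, 1.0), (5.0, 1.0), (5.0, 1.0), (5.0, 1.0)]
alarms = [monitor.update(y_true, y_pred) for y_true, y_pred in stream]
print(alarms)  # [False, False, False, True, True, True]
```

In a full feedback loop, an alarm would trigger the correction step (a refit or parameter update) and the monitor would then continue on the updated model's errors.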

Q: Is replication a part of the auditing process for AI models, and how can it be applied?

A: Replication can indeed be a valuable part of the auditing process for AI models. It involves comparing the model's predictions with real-world data to verify its accuracy and performance. In the context of Fairgen, replication could involve using synthetic data augmentation to predict outcomes, and then conducting real-world fieldwork to collect data and compare the predictions with actual results. This process helps ensure the model's reliability and effectiveness across various scenarios and conditions.

About Emmanuel Candès

Emmanuel Candès currently holds the Barnum-Simons Chair in Mathematics and Statistics and serves as a Professor of Electrical Engineering (by courtesy) at Stanford University. Until 2009, he was the Ronald and Maxine Linde Professor of Applied and Computational Mathematics at the California Institute of Technology. In 1993, Emmanuel graduated from the École Polytechnique with a degree in science and engineering, and he received his Ph.D. in Statistics from Stanford University in 1998.

Throughout his career, Emmanuel Candès has garnered a multitude of awards and honors, including the MacArthur Fellowship (2017), popularly known as the ‘genius award’. He is a member of the U.S. National Academy of Sciences and of the American Academy of Arts and Sciences.

Learn more about his career at Emmanuel Candès’s website