
Chris Mammen Talks Synthetic Data Risks in AI Training

Synthetic data (artificially generated data that replicates real-world experiences) is increasingly being used to train AI models. But relying too heavily on synthetic data can lead to fundamental flaws. Womble Bond Dickinson Partner Chris Mammen recently discussed these risks with Lexology, saying, “One illustration shows that if a data set with images of dogs is re-trained on AI outputs, or synthetic data, the most common images - say golden retrievers - will gradually become over-represented in the data, until all of the outputs are golden retrievers. Then the model starts to lose track.”
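For readers who want to see the golden retriever illustration in numbers, here is a minimal Python sketch. It is a toy model, not how any real AI system is trained: the breed names, the starting shares, and the `sharpen` setting are all made up for illustration, and the key assumption is that each new generation slightly over-produces whatever was already most common in its training data.

```python
def retrain_on_own_outputs(shares, generations, sharpen=1.5):
    """Return the breed shares after each round of training on prior outputs.

    `sharpen` is a hypothetical knob: values above 1 encode the assumption
    that a model favors whatever was already common in its training data.
    """
    history = [dict(shares)]
    for _ in range(generations):
        # Assumed behavior: common breeds are over-produced, rare ones under-produced.
        weights = {breed: share ** sharpen for breed, share in shares.items()}
        total = sum(weights.values())
        # The outputs become the next generation's training data.
        shares = {breed: weight / total for breed, weight in weights.items()}
        history.append(shares)
    return history


# Hypothetical starting mix of dog images (illustrative numbers only).
start = {
    "golden retriever": 0.40,
    "labrador": 0.30,
    "poodle": 0.20,
    "dachshund": 0.10,
}

history = retrain_on_own_outputs(start, generations=10)
for generation in (0, 1, 3, 5, 10):
    top_share = history[generation]["golden retriever"]
    print(f"generation {generation}: golden retrievers = {top_share:.1%}")
```

Under these assumptions, the golden retriever share rises with every round while the rarer breeds shrink toward zero, which mirrors the over-representation Mammen describes.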


 


Tags

san francisco, ai and machine learning, artificial intelligence