Increasingly, AI models are being trained on synthetic data, meaning artificially generated data that replicates real-world experiences. But relying too heavily on synthetic data can lead to fundamental flaws. Womble Bond Dickinson Partner Chris Mammen recently discussed these risks with Lexology, saying, “One illustration shows that if a data set with images of dogs is re-trained on AI outputs, or synthetic data, the most common images—say, golden retrievers—will gradually become over-represented in the data, until all of the outputs are golden retrievers. Then the model starts to lose track.”
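The feedback loop Mammen describes can be sketched with a toy simulation. This is an illustrative assumption, not his actual example: we model each round of re-training on synthetic outputs as slightly over-weighting the already-frequent classes (raising each probability to a power above 1 and renormalizing), and watch the most common breed take over the distribution.

```python
# Toy model of "model collapse" from re-training on synthetic data.
# Assumption (hypothetical, for illustration only): each re-training round
# amplifies frequent classes, modeled as p_i -> p_i**alpha / sum, alpha > 1.

def retrain(dist, alpha=1.2):
    """One round of training on the model's own outputs: frequent classes gain share."""
    raised = {breed: p ** alpha for breed, p in dist.items()}
    total = sum(raised.values())
    return {breed: p / total for breed, p in raised.items()}

# Starting distribution of dog breeds in the training data (hypothetical values).
dist = {"golden_retriever": 0.4, "poodle": 0.3, "beagle": 0.3}

for generation in range(30):
    dist = retrain(dist)

# After repeated rounds, the most common class dominates almost entirely.
print({breed: round(p, 4) for breed, p in dist.items()})
```

With these numbers, after 30 simulated generations the golden retriever share climbs toward 1.0 while the other breeds vanish, mirroring the loss of diversity in the quoted example.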
