    Synthetic data is the new AI gold rush, but critics call it ‘data laundering’

    By Techurz, August 14, 2025

    AI development is moving at a rapid pace, but it risks running headlong into a wall. As websites increasingly place barriers on scraping (some of which are allegedly ignored), and as the remaining content is voraciously collected by scrapers to train AI models, concerns are growing that we may run out of usable training data.

    The industry’s answer? Synthetic data.

    “Recently in the industry, synthetic data has been talked about a lot,” said Sebastien Bubeck, a member of technical staff at OpenAI, in the company’s livestreamed release of GPT-5 last week. Bubeck stressed its importance for the future of AI models—an idea echoed by his boss, Sam Altman, who live-tweeted the event, saying he was “excited for much more to come.”

    The prospect of relying heavily on synthetic data hasn’t gone unnoticed by the creative industries. “I believe the main reason companies like OpenAI are having to rely more on synthetic data now is that they’ve run out of high-quality human created data to mine from the public facing internet,” says Reid Southern, a film concept artist and illustrator.

    Southern believes there’s another motive. “It further distances them from any copyrighted materials they’ve trained on that could land them in hot water.”

    For this reason, he has publicly called the practice “data laundering.” He argues that AI companies could train their models on copyrighted works, generate AI variations, then remove the originals from their datasets. They could then “claim their training set is ‘ethical’ because it didn’t technically train on the original image by their logic,” says Southern. “That’s why we call it data laundering, because in a sense, they’re attempting to clean the data and strip it of its copyright.” (OpenAI did not respond to Fast Company’s request for comment.)

    The issue is more nuanced, according to Felix Simon, an AI researcher at the University of Oxford. “In one sense, it doesn’t really remediate the original harm over which creators and AI firms squabble,” he says. “After all, synthetic data isn’t plucked from the ether but presumably created with models that have reportedly been trained with data from creators and copyright holders—often without their permission and without compensation.” From the perspective of societal justice, rights, and duties, “these rights holders still are owed something even with the use of synthetic data—be that compensation, acknowledgements, or both.”

    Ed Newton-Rex, founder of Fairly Trained—a non-profit certifying AI companies that respect creators’ intellectual property rights—shares Southern’s concerns. “I think synthetic data is a legitimately helpful way to augment your dataset,” he says. “If you’re training an AI model, it’s a way of increasing the coverage of your training data. And at a time when we’re butting up against the limits of legitimately accessible training data, it’s seen as a way to extend the usable life of that data.”

    Still, Newton-Rex acknowledges its darker side. “At the same time, I think unfortunately its effect is, at least in part, one of copyright laundering,” he says. “I think both are true.”

    He warns against taking AI firms’ promises at face value. “Synthetic data is not a panacea from the incredibly important copyright questions,” he says. “I think there tends to be so much of a feeling that synthetic data helps you, as an AI developer, get around copyright concerns.” That belief, he says, is wrong.

    The way AI companies frame synthetic data and talk about model training also helps them distance themselves from the individuals whose work they may be using, Newton-Rex says. “The average listener, if they hear this model was trained on synthetic data, they’re bound to think, ‘Oh, right, okay. Well, this probably isn’t Ed Sheeran’s latest album, right?’ It further moves us away from an easy understanding of how these models are actually made, which is ultimately by exploiting people’s life’s work.”

    He compares it to plastic recycling, where a recycled container might once have been a toy, a car bumper, or something else entirely. “The fact these AI models mash all this stuff up and generate, quote-unquote, ‘new output’, does nothing to reduce their reliance on the original work.”

    For Newton-Rex, this is the critical takeaway: “Really the absolutely critical element here, and it’s just got to be remembered, is that even in a world of synthetic data, what’s happening is people’s work is being exploited in order to compete with them.”
