Techurz

    Reddit blocks the Internet Archive from crawling its data – here’s why

    By Techurz | August 12, 2025 | 3 Mins Read
    Andriy Onufriyenko/Getty Images

    ZDNET’s key takeaways

    • The Internet Archive can now only crawl Reddit’s homepage.
    • Reddit’s goal is to block AI firms from scraping Reddit user data.
    • Publishers (and others) are suing AI companies for copyright infringement.

    Reddit is moving to protect its content from AI companies that take roundabout approaches to scraping it.

    The social media platform, known as a resource where users can post anonymously and find information about virtually any subject, will block the Internet Archive’s Wayback Machine from indexing its online data, according to a Monday report from The Verge. The move responds to the discovery that AI firms, unable to scrape data from Reddit directly because of the platform’s restrictive policies, have instead been retrieving its data from indexed content on the Internet Archive and using it to train models.

    The Wayback Machine will now only be able to scrape data from Reddit’s homepage, according to The Verge, while access to user profiles, comments, and post detail pages will be blocked.
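
    The report does not say how the restriction is enforced, so the following is only a minimal sketch of the described policy (homepage allowed, everything else blocked). The function name, the host list, and the example URLs are all hypothetical and are not taken from Reddit or the Internet Archive.

```python
from urllib.parse import urlsplit

# Hypothetical host list, assumed for illustration only.
REDDIT_HOSTS = {"reddit.com", "www.reddit.com"}


def archive_crawler_may_fetch(url: str) -> bool:
    """Return True only when the URL points at the site homepage.

    Assumption: "homepage" means an empty or root path on a known host.
    Profiles, comments, and post detail pages all have longer paths,
    so they fall through to False.
    """
    parts = urlsplit(url)
    if parts.hostname not in REDDIT_HOSTS:
        return False
    return parts.path in ("", "/")


if __name__ == "__main__":
    examples = [
        "https://www.reddit.com/",
        "https://www.reddit.com/user/example",
        "https://www.reddit.com/r/example/comments/abc123/some_post/",
    ]
    for url in examples:
        print(url, archive_crawler_may_fetch(url))
```

    Under these assumptions, only the first example URL passes the check, which mirrors the homepage-only access described in the report.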

    Launched in 1996, the Internet Archive is a non-profit that operates an enormous digital database of web content. The archive is maintained in part by the Wayback Machine, a piece of web-crawling software that gathers web pages and preserves them as they appeared when they were collected, like digital flies in amber. The resulting collection serves as a resource for researchers studying the evolution of online culture and as a source of digital forensic evidence for law enforcement, among other uses.

    What Reddit’s move means

    Reddit has previously raised concerns with the Internet Archive about the scraping of its content, according to The Verge. The non-profit was also reportedly notified before the web-crawling restrictions started going into effect yesterday.

    The Internet Archive has yet to make an official statement about how it plans to respond to Reddit’s new restrictions, and at the time of writing, it has not responded to ZDNET’s request for comment. Wayback Machine director Mark Graham, however, has told multiple publications that the Internet Archive will “continue to have ongoing discussions about this matter” with Reddit.

    Growing tension

    Reddit’s reported decision to block the Wayback Machine from scraping the majority of its content arrives during a moment of mounting tension between AI companies and digital publishers, though Reddit is the first tech company to wade into the debate. The company sued Anthropic in June, alleging that the AI company was illegally scraping its data, but it has also previously signed licensing deals with both Google and OpenAI.

    (Disclosure: Ziff Davis, ZDNET’s parent company, filed an April 2025 lawsuit against OpenAI, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.) 

    AI developers require access to gargantuan troves of information to train generative AI models, which are designed to identify and replicate subtle mathematical patterns gleaned from those training datasets.

    Many of those companies have scraped training data from publicly available websites, including social media sites and news outlets, claiming legal protection under the copyright doctrine known as fair use. (The courts are still untangling the legitimacy of that argument, and will likely be doing so for some time.)

    Many of the organizations whose content has been copiously scraped — along with a cohort of authors and other artists — have responded with lawsuits. 

    Others, meanwhile, have signed content licensing agreements with the likes of OpenAI, Anthropic, and Google, consenting to the use of their organizations’ data in exchange for increased visibility in the responses generated by chatbots, or other benefits.
