    AI crawlers vs. web defenses: Cloudflare-Perplexity fight reveals cracks in internet trust

    By Techurz, August 5, 2025

    A public war of words has erupted between cloud infrastructure leader Cloudflare and AI search company Perplexity, with both sides making serious allegations about each other’s technical competence in a dispute that industry analysts say exposes fundamental flaws in how enterprises protect content from AI data collection.

    The controversy began when Cloudflare published a scathing technical report accusing Perplexity of “stealth crawling” — using disguised web browsers to sneak past website blocks and scrape content that site owners explicitly wanted to keep away from AI training. Perplexity quickly fired back, accusing Cloudflare of creating a “publicity stunt” by misattributing millions of web requests from unrelated services to boost its own marketing efforts.

    Industry experts warn that the exchange shows current bot detection tools cannot reliably distinguish legitimate AI services from problematic crawlers, leaving enterprises without dependable protection strategies.

    Cloudflare’s technical allegations

    Cloudflare’s investigation started after customers complained that Perplexity was still accessing their content despite blocking its known crawlers through robots.txt files and firewall rules. To test this, Cloudflare created brand-new domains, blocked all AI crawlers, and then asked Perplexity questions about those sites.

    “We discovered Perplexity was still providing detailed information regarding the exact content hosted on each of these restricted domains,” Cloudflare reported in a blog post. “This response was unexpected, as we had taken all necessary precautions to prevent this data from being retrievable by their crawlers.”

    The company found that when Perplexity’s declared crawler was blocked, it allegedly switched to a generic browser user agent designed to look like Chrome on macOS. This alleged stealth crawler generated 3-6 million daily requests across tens of thousands of websites, while Perplexity’s declared crawler handled 20-25 million daily requests.
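Why a switch of user agent would defeat a simple block can be sketched in a few lines of Python. This is a minimal illustration, not Cloudflare's detection logic: the deny-list, the function name, and both header strings are assumptions made for the example, and apart from ChatGPT-User the crawler tokens do not come from this article.

```python
# Hypothetical deny-list of declared AI crawler tokens (assumed for illustration).
DECLARED_AI_TOKENS = ("perplexitybot", "perplexity-user", "gptbot", "chatgpt-user")


def blocked_by_user_agent(user_agent: str) -> bool:
    """Return True if the User-Agent header contains a deny-listed token."""
    ua = user_agent.lower()
    return any(token in ua for token in DECLARED_AI_TOKENS)


# Illustrative header values; not copied from either company's report.
declared_ua = "Mozilla/5.0 (compatible; PerplexityBot/1.0; +https://example.com/bot)"
generic_ua = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
)

print(blocked_by_user_agent(declared_ua))  # True: the declared crawler is caught
print(blocked_by_user_agent(generic_ua))   # False: a browser-like string passes
```

A rule keyed only on the self-reported header can be bypassed by any client that changes the header, which is why detection vendors tend to combine it with other signals such as IP ranges and request behavior.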

    Cloudflare emphasized that this behavior violated basic web principles: “The Internet as we have known it for the past three decades is rapidly changing, but one thing remains constant: it is built on trust. There are clear preferences that crawlers should be transparent, serve a clear purpose, perform a specific activity, and, most importantly, follow website directives and preferences.”

    By contrast, when Cloudflare tested OpenAI’s ChatGPT with the same blocked domains, “we found that ChatGPT-User fetched the robots file and stopped crawling when it was disallowed. We did not observe follow-up crawls from any other user agents or third-party bots.”
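The compliant behavior described here, checking the robots file and stopping when disallowed, can be sketched with Python's standard `urllib.robotparser` module. The robots.txt content, the `may_fetch` helper, the URL, and the ExampleBot name are hypothetical; only the ChatGPT-User agent name comes from the report quoted above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that disallows one named agent and allows everyone else.
ROBOTS_TXT = """\
User-agent: ChatGPT-User
Disallow: /

User-agent: *
Allow: /
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


url = "https://example.com/private/report.html"
print(may_fetch(ROBOTS_TXT, "ChatGPT-User", url))  # False: a compliant agent stops here
print(may_fetch(ROBOTS_TXT, "ExampleBot", url))    # True: falls under the wildcard rule
```

The check is advisory: nothing in robots.txt prevents a client from skipping it, so the mechanism only works for crawlers that choose to honor it.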

    Perplexity’s ‘publicity stunt’ accusation

    Perplexity wasn’t having any of it. In a LinkedIn post that pulled no punches, the company accused Cloudflare of deliberately targeting its own customer for marketing advantage.

    The AI company suggested two possible explanations for Cloudflare’s report: “Cloudflare needed a clever publicity moment and we – their own customer – happened to be a useful name to get them one” or “Cloudflare fundamentally misattributed 3-6M daily requests from BrowserBase’s automated browser service to Perplexity.”

    Perplexity claimed the disputed traffic actually came from BrowserBase, a third-party cloud browser service that it uses sparingly: fewer than 45,000 of Perplexity’s daily requests, versus the 3-6 million that Cloudflare attributed to stealth crawling.
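The gap between the two figures is easier to see as a percentage. The short calculation below uses only the numbers quoted above and treats 45,000 as an upper bound, since Perplexity said "fewer than"; the variable and function names are illustrative.

```python
BROWSERBASE_DAILY_MAX = 45_000           # upper bound cited by Perplexity
DISPUTED_DAILY = (3_000_000, 6_000_000)  # range Cloudflare attributed to stealth crawling


def share_of_disputed(browserbase: int, disputed: int) -> float:
    """Fraction of the disputed traffic that the BrowserBase usage could explain."""
    return browserbase / disputed


for disputed in DISPUTED_DAILY:
    share = share_of_disputed(BROWSERBASE_DAILY_MAX, disputed)
    print(f"{disputed:,} disputed requests/day -> at most {share:.2%}")
```

On Perplexity's own numbers, its BrowserBase usage would cover at most 1.5% of the low end of Cloudflare's range and 0.75% of the high end, which is the basis for its claim that the rest came from other BrowserBase customers.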

    “Cloudflare fundamentally misattributed 3-6M daily requests from BrowserBase’s automated browser service to Perplexity, a basic traffic analysis failure that’s particularly embarrassing for a company whose core business is understanding and categorizing web traffic,” Perplexity shot back.

    The company also argued that Cloudflare misunderstands how modern AI assistants work: “When you ask Perplexity a question that requires current information — say, ‘What are the latest reviews for that new restaurant?’ — the AI doesn’t already have that information sitting in a database somewhere. Instead, it goes to the relevant websites, reads the content, and brings back a summary tailored to your specific question.”

    Perplexity took direct aim at Cloudflare’s competence: “If you can’t tell a helpful digital assistant from a malicious scraper, then you probably shouldn’t be making decisions about what constitutes legitimate web traffic.”

    Expert analysis reveals deeper problems

    Industry analysts say the dispute exposes broader vulnerabilities in enterprise content protection strategies that go beyond this single controversy.

    “Some bot detection tools exhibit significant reliability issues, including high false positives and susceptibility to evasion tactics, as evidenced by inconsistent performance in distinguishing legitimate AI services from malicious crawlers,” said Charlie Dai, VP and principal analyst at Forrester.

    Sanchit Vir Gogia, chief analyst and CEO at Greyhound Research, argued that the dispute “signals an urgent inflection point for enterprise security teams: traditional bot detection tools — built for static web crawlers and volumetric automation — are no longer equipped to handle the subtlety of AI-powered agents operating on behalf of users.”

    The technical challenge is nuanced, Gogia explained: “While advanced AI assistants often fetch content in real-time for a user’s query — without storing or training on that data — they do so using automation frameworks like Puppeteer or Playwright that bear a striking resemblance to scraping tools. This leaves bot detection systems guessing between help and harm.”

    The path to new standards

    This fight isn’t just about technical details — it’s about establishing rules for AI-web interaction. Perplexity warned of broader consequences: “The result is a two-tiered internet where your access depends not on your needs, but on whether your chosen tools have been blessed by infrastructure controllers.”

    Industry frameworks are emerging, but slowly. “Mature standards are unlikely before 2026. Enterprises might still have to rely on custom contracts, robots.txt, and evolving legal precedents in the interim,” Dai noted. Meanwhile, some companies are developing solutions: OpenAI is piloting identity verification through Web Bot Auth, allowing websites to cryptographically confirm agent requests.
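The article does not describe how Web Bot Auth works internally, so the following is only a rough sketch of the general idea of cryptographically confirming a request. It swaps in a shared-secret HMAC from Python's standard library as a stand-in; a scheme meant for the open web would more plausibly rely on public-key signatures, so that sites never hold the agent's secret. All names, fields, and the key are hypothetical.

```python
import hashlib
import hmac


def signature_base(method: str, authority: str, path: str, created: int) -> bytes:
    """Canonical byte string covering the request fields that get signed."""
    return "\n".join([method.upper(), authority.lower(), path, str(created)]).encode()


def sign_request(key: bytes, method: str, authority: str, path: str, created: int) -> str:
    """Agent side: produce a hex signature over the canonical request fields."""
    base = signature_base(method, authority, path, created)
    return hmac.new(key, base, hashlib.sha256).hexdigest()


def verify_request(key: bytes, signature: str, method: str, authority: str,
                   path: str, created: int, now: int, max_age: int = 300) -> bool:
    """Site side: reject stale timestamps, then compare in constant time."""
    if not 0 <= now - created <= max_age:
        return False
    expected = sign_request(key, method, authority, path, created)
    return hmac.compare_digest(expected, signature)


key = b"demo-shared-secret"  # hypothetical key, for illustration only
created = 1_700_000_000      # fixed timestamp so the example is deterministic
sig = sign_request(key, "GET", "example.com", "/article", created)

print(verify_request(key, sig, "GET", "example.com", "/article", created, now=created + 10))   # True
print(verify_request(key, sig, "GET", "example.com", "/other", created, now=created + 10))     # False
print(verify_request(key, sig, "GET", "example.com", "/article", created, now=created + 900))  # False
```

Unlike a user-agent string, a valid signature cannot be produced by a client that lacks the key, which is what would let a site distinguish a verified agent from an impostor using the same name.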

    Gogia warned of broader implications: “The risk is a balkanised web, where only vendors deemed compliant by major infrastructure providers are allowed access, thus favouring incumbents and freezing out open innovation.”
