ai scraping content stealthily

When a website puts up a “no AI bots allowed” sign, Perplexity apparently sees it as more of a suggestion.

Cloudflare’s research team caught the AI company red-handed, scraping sites that explicitly blocked AI crawlers. The scale? Tens of thousands of domains. Millions of requests daily. That’s not an accident.

Perplexity’s tactics read like a spy thriller. They’re changing user-agent strings to pretend they’re regular browsers. Switching between different networks to dodge IP blocks. Even using third-party scraping APIs like Crawlbase that handle the dirty work – rotating IPs, bypassing CAPTCHAs, rendering JavaScript. Some scrapers turn to solutions like Bright Data’s Web Unlocker API to break through HTTP request barriers and 403 Forbidden errors.

They’re masquerading as regular browsers, switching networks, using third-party scraping APIs to bypass blocks.

Clever? Sure. Ethical? That’s another story. The company’s CEO couldn’t even define plagiarism when asked during an interview, raising questions about their content ethics.

Cloudflare had to break out machine learning and network analysis.

References

You May Also Like

Musk Claims Grok 3.5 Abandons Internet Sources for Pure Reasoning

Musk’s Grok 3.5 abandons internet truth for “pure reasoning” – a million-GPU gamble that challenges everything we know about AI verification. Is this genius or madness?

AI Takes Over: TikTok Fires UK Human Moderators as Online Safety Act Looms

TikTok fires hundreds of UK moderators for AI that misses 15% of violations while regulators threaten £18 million fines.

ChatGPT: The Controversial AI Tool 79% of Lawyers Can’t Resist

79% of lawyers secretly use ChatGPT while 63.6% of people say it shouldn’t give legal advice. The profession faces an identity crisis.

AI Job Interviews Silently Discriminate Against Vulnerable Australians, Research Reveals

AI hiring tools silently reject minorities while claiming to reduce bias. Data shows 85% preference for white names, zero preference for Black men. Your resume might be judged by algorithms you can’t challenge.