When a website puts up a “no AI bots allowed” sign, Perplexity apparently sees it as more of a suggestion.

Cloudflare’s research team caught the AI company red-handed, scraping sites that explicitly blocked AI crawlers. The scale? Tens of thousands of domains. Millions of requests daily. That’s not an accident.

Perplexity’s tactics read like a spy thriller. They’re changing user-agent strings to pretend they’re regular browsers. Switching between different networks to dodge IP blocks. Even using third-party scraping APIs like Crawlbase that handle the dirty work: rotating IPs, bypassing CAPTCHAs, rendering JavaScript. Some scrapers turn to services like Bright Data’s Web Unlocker API to get past request blocking and 403 Forbidden responses.
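To see why a swapped user-agent string matters, here is a minimal Python sketch using the standard library's robots.txt parser. The bot name `ExampleAIBot`, the robots.txt rules, and the URL are hypothetical stand-ins, not Perplexity's actual identifiers or any site's real configuration. The point is only that a rule keyed on a crawler's declared name stops matching the moment the crawler calls itself something else.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one named AI crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""

# A generic desktop-browser style user-agent string (illustrative only).
SPOOFED_BROWSER_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
)


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/article"
    # The honestly declared crawler matches the Disallow rule.
    print("declared bot allowed:", is_allowed("ExampleAIBot", url))
    # The same crawler under a browser-like name only matches the wildcard rule.
    print("spoofed UA allowed:", is_allowed(SPOOFED_BROWSER_UA, url))
```

Keep in mind that robots.txt is advisory: it tells well-behaved crawlers what to skip, but nothing in it enforces the rule. That is why name-based blocking, whether in robots.txt or in a firewall rule, fails against a client that misreports who it is.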

Clever? Sure. Ethical? That’s another story. The company’s CEO couldn’t even define plagiarism when asked during an interview, raising questions about the company’s content ethics.

Cloudflare had to break out machine learning and network analysis.
