When a website puts up a “no AI bots allowed” sign, Perplexity apparently sees it as more of a suggestion.

Cloudflare’s research team caught the AI company red-handed, scraping sites that explicitly blocked AI crawlers. The scale? Tens of thousands of domains. Millions of requests daily. That’s not an accident.

Perplexity’s tactics read like a spy thriller. They’re changing user-agent strings to pretend they’re regular browsers. Switching between different networks to dodge IP blocks. Even using third-party scraping APIs like Crawlbase that handle the dirty work – rotating IPs, bypassing CAPTCHAs, rendering JavaScript. Some scrapers turn to solutions like Bright Data’s Web Unlocker API to break through HTTP request barriers and 403 Forbidden errors.
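To see why a changed user-agent string is enough to slip past a "no AI bots" rule, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the `example.com` URL, the `is_allowed` helper, and the crawler names `PerplexityBot` and `Perplexity-User` are assumptions made for illustration; none of them come from Cloudflare's findings.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a site that opts out of AI crawlers by name.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: Perplexity-User
Disallow: /

User-agent: *
Allow: /
"""

# Illustrative user-agent strings (assumed values, not captured traffic).
DECLARED_CRAWLER = "PerplexityBot"
GENERIC_BROWSER = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
)


def is_allowed(user_agent: str, url: str = "https://example.com/article") -> bool:
    """Return True if the robots.txt rules above permit this user agent."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The rule only matches clients that announce themselves by name.
    print("declared crawler allowed:", is_allowed(DECLARED_CRAWLER))
    print("browser-like agent allowed:", is_allowed(GENERIC_BROWSER))
```

The rule is matched against whatever name the client chooses to send, so it only binds crawlers that identify themselves honestly. A request carrying a browser-like string falls through to the wildcard entry.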

Clever? Sure. Ethical? That’s another story. The company’s CEO couldn’t even define plagiarism when asked during an interview, raising questions about its content ethics.

To spot crawlers hiding behind browser disguises, Cloudflare had to break out machine learning and network analysis.
