
When a website puts up a “no AI bots allowed” sign, Perplexity apparently sees it as more of a suggestion.

Cloudflare’s research team caught the AI company red-handed, scraping sites that explicitly blocked AI crawlers. The scale? Tens of thousands of domains. Millions of requests daily. That’s not an accident.

Perplexity’s tactics read like a spy thriller. They’re changing user-agent strings to pretend they’re regular browsers. Switching between different networks to dodge IP blocks. Even using third-party scraping APIs like Crawlbase that handle the dirty work – rotating IPs, bypassing CAPTCHAs, rendering JavaScript. Some scrapers turn to solutions like Bright Data’s Web Unlocker API to break through HTTP request barriers and 403 Forbidden errors.
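To see why a changed user-agent string matters, here is a minimal sketch using Python's standard `urllib.robotparser`. Everything in it is illustrative: the robots.txt contents, the `PerplexityBot` token as the blocked crawler name, the browser-style string, and the `example.com` URL are assumptions, not details taken from Cloudflare's findings. It only shows that robots.txt rules are matched against whatever name a client chooses to announce.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one named AI crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""

# Illustrative user-agent strings (assumptions for this sketch).
DECLARED_BOT = "PerplexityBot"
BROWSER_LIKE = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/124.0 Safari/537.36"


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/article"
    # The crawler that identifies itself honestly is refused...
    print(DECLARED_BOT, "->", is_allowed(ROBOTS_TXT, DECLARED_BOT, url))
    # ...while the same request under a browser-style name falls through
    # to the catch-all rule and is permitted.
    print("browser-like ->", is_allowed(ROBOTS_TXT, BROWSER_LIKE, url))
```

The check is purely advisory. Nothing in robots.txt verifies who is actually making the request, which is why blocking by name only works against crawlers that announce themselves.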

Clever? Sure. Ethical? That’s another story. The company’s CEO couldn’t even define plagiarism when asked during an interview, raising questions about the company’s content ethics.

To pick the disguised traffic out from real visitors, Cloudflare had to break out machine learning and network analysis.
