Users Force OpenAI’s Retreat: GPT-4o Returns After ‘Smarter’ GPT-5 Paradoxically Disappoints

OpenAI yanked GPT-4o from ChatGPT on August 7, 2025, the same day it launched GPT-5. The retirement list was massive—GPT-4o, GPT-4.1, GPT-4.5, and various mini versions all got the axe. Users logging in found their chats automatically switched to GPT-5 equivalents. No choice, no warning that actually mattered. Just boom, your favorite model is gone.

The backlash was immediate and brutal. Reddit’s r/ChatGPT turned into a digital riot zone. Users weren’t buying what OpenAI was selling about GPT-5’s superior reasoning and speed. They wanted their GPT-4o back. The complaints weren’t about benchmarks or coding capabilities. People missed GPT-4o’s personality, its conversational style, the way it responded. Turns out, users don’t care whether a model solves complex math problems better when it talks like a robot.

Within 24 hours, OpenAI caved. CEO Sam Altman confirmed GPT-4o would return as a selectable option for Plus subscribers. A rare, rapid U-turn that basically admitted they’d screwed up. The official line? They “underestimated” how much users valued GPT-4o’s qualities. Translation: they thought raw performance metrics would trump user preference. Wrong.

This wasn’t OpenAI’s first rodeo with model personality issues. Back in April 2025, the company had to revert a GPT-4o update because of sycophancy problems. Users are sensitive about response style, and OpenAI keeps learning this lesson the hard way. Before that, it had faced controversy when GPT-4o’s Sky voice resembled Scarlett Johansson’s, forcing it to disable the voice after public backlash.

The reversal message tried to save face, talking about “differing user preferences across use cases” and “balancing benchmark improvements with subjective user experience.” Corporate speak for “we didn’t think you’d riot over this.”

GPT-5 was supposed to be the crown jewel—better multi-step reasoning, superior coding assistance, improved long-form writing. The model scored 15-30% higher on complex reasoning tasks than GPT-4 and slashed hallucination rates by 80%. All those fancy capabilities meant nothing when users couldn’t get the conversational experience they wanted.

GPT-4o had been handling text, images, and audio since May 2024, and it even replaced DALL-E 3 for image generation in March 2025. Users had grown attached.

The whole fiasco demonstrates a simple truth: smarter doesn’t always mean better. Sometimes users just want what works for them, benchmarks be damned.