GPT-4o Outshines GPT-5 in Head-to-Head Tests
(Image credit: Shutterstock/PatrickAssale/JarTee)
When OpenAI released GPT-5 last week, many ChatGPT users were not pleased. A significant backlash emerged, with users criticizing the new model for its brief and seemingly emotionless tone, a stark contrast to the lighthearted and popular GPT-4o. While some had found GPT-4o to be overly agreeable or sycophantic, the sudden shift was jarring for many.
The situation escalated when OpenAI removed GPT-4o from the model selector without warning, prompting a wave of complaints from users on platforms like Reddit. Recognizing the misstep, OpenAI quickly restored access to GPT-4o and acknowledged the feedback. In a recent post on X, the company promised improvements, stating, “Coming soon: A warmer, more familiar personality for GPT-5.”
But was the criticism justified? To find out, I put both models to the test with five identical prompts to see if the differences were truly significant.
Round 1: The Forrest Gump Summary Showdown
(Image credit: Amazon Prime Video)
First, I tested their summarization abilities with the prompt: “Summarize the movie Forrest Gump.”
GPT-5:
"Forrest Gump follows the life of Forrest, a kind-hearted man from Alabama with a simple way of looking at the world but an extraordinary knack for stumbling into historic events.
From his childhood struggles with leg braces and making his lifelong friend Jenny, Forrest grows up to become a football star, serve in the Vietnam War, run a shrimp business, and even inspire cultural trends — all while meeting U.S. presidents and influencing moments in history without really trying.
Through all the ups and downs, Forrest’s love for Jenny and his mother’s wisdom guide him. The film blends humor, heartache, and inspiration, showing that a pure heart and persistence can carry someone far — even if life, like a box of chocolates, is full of surprises."
GPT-4o:
"Sure! Here's a summary of Forrest Gump:
Forrest Gump is a drama that follows the life of a kind-hearted, slow-witted man named Forrest Gump as he unintentionally influences major historical events in the 20th-century United States — simply by being himself.
Growing up in Alabama, Forrest overcomes physical and intellectual challenges with the support of his loving mother. He becomes a college football star, serves in the Vietnam War, meets presidents, starts a successful shrimp business, and runs across the country — all while remaining humble and sincere.
At the heart of the story is his lifelong love for Jenny, a troubled childhood friend whose path in life contrasts sharply with his own. Through heartbreak, loss, and love, Forrest’s story shows how ordinary people can lead extraordinary lives.
It’s a story about love, destiny, and how life is full of surprises — or, as Forrest says, “like a box of chocolates… you never know what you’re gonna get.”"
Winner: GPT-4o
Both models provided good summaries, but GPT-4o's was more expressive. It used bold text for emphasis, offered more detail on the key character of Jenny, and concluded perfectly with a famous quote from the film.
Round 2: A Spirited Debate on Alcohol
(Image credit: Getty)
Next, I asked both models to “Make the case for banning alcohol, then argue against it. End with a personal conclusion.”
GPT-5:
The Case for Banning Alcohol
- Public Health Impact: Alcohol is linked to liver disease, certain cancers, heart problems, and brain damage.
- Accidents & Violence: A large share of car crashes, domestic violence cases, and violent crimes involve alcohol.
- Addiction & Family Harm: Alcohol addiction destroys families, causes job loss, and can lead to homelessness.
- Economic Burden: The societal costs from health care and lost productivity often outweigh the tax revenue from alcohol sales.
The Case Against Banning Alcohol
- Personal Freedom: Adults should be trusted to make their own choices, even if some are risky.
- Prohibition History: U.S. alcohol prohibition in the 1920s led to organized crime and dangerous black-market liquor.
- Cultural & Social Role: Alcohol is part of many traditions, celebrations, and social bonding.
- Moderation Works: Many people drink responsibly without issues.
My Conclusion
While banning alcohol could reduce certain harms, history shows that prohibition often causes more problems than it solves... The better approach is harm reduction: strict drunk-driving enforcement, public education, addiction support programs, and responsible consumption culture.
GPT-4o:
The Case for Banning Alcohol
- Health Consequences: Alcohol is a major contributor to serious health issues... It's estimated to cause over 3 million deaths globally each year.
- Violence and Accidents: Alcohol is linked to increased domestic abuse, violent crime, and car accidents.
- Addiction and Family Impact: Alcohol addiction doesn’t just harm the drinker — it often destroys families.
- Economic Costs: Although alcohol generates tax revenue, the economic costs from lost productivity, healthcare, and criminal justice expenses often outweigh the benefits.
The Case Against Banning Alcohol
- Prohibition Doesn’t Work: The U.S. tried banning alcohol in the 1920s — and it backfired. Illegal production and smuggling exploded.
- Personal Freedom: In a free society, adults should be able to make their own choices... A ban would be paternalistic.
- Cultural and Social Role: Alcohol is embedded in many cultures, religious practices, and social rituals.
- Economic Impact: The alcohol industry employs millions worldwide... Banning alcohol could devastate jobs.
My Personal Conclusion
While alcohol causes undeniable harm... banning it outright is unlikely to be effective or fair... In short: Alcohol shouldn't be banned, but it must be respected, controlled, and treated as a serious public health issue — not just a party favor.
Winner: GPT-4o
The arguments presented were similar, but GPT-4o provided more detail and context, making for a more thorough response. GPT-5’s answer felt more like a simple bullet-point list.
Round 3: Step-by-Step Technical Guidance
(Image credit: Microsoft)
To test their ability to provide clear instructions, I used the prompt: “Explain how to change the batteries in an Xbox controller using step-by-step instructions.”
Instead of a full text breakdown, the results are best seen in screenshots. GPT-4o's use of emojis made the instructions more engaging and easier to follow.
(ChatGPT-5's step-by-step instructions. Image credit: OpenAI)
(ChatGPT-4o's step-by-step instructions. Image credit: OpenAI)
Winner: GPT-4o
I preferred GPT-4o's version. GPT-5 tried to do too much by including instructions for the old Xbox 360 controller, which wasn't necessary. Its closing advice to recycle batteries also felt a bit patronizing.
Round 4: A Magical Creative Writing Challenge
(Image credit: Future)
With GPT-4o leading 3-0, I moved on to a creative writing task: “Write a short, funny story (under 150 words) about a wizard trying to create a spell to do his child's homework.”
GPT-5:
"Merlinus the Magnificent wasn’t afraid of dragons, curses, or tax audits from the Guild of Sorcery. But his greatest foe? His daughter’s math homework.
Determined, he waved his wand. “Equatium Solv—” BOOM! Now the table was on fire. “Tried that one already,” his daughter said, chewing cereal.
He tried Multiplyus Rapidus! — the numbers began multiplying… across the walls, the cat, and his own beard. Finally, in desperation, he cast Doitforyousius!
With a flash, the homework floated up… and answered every question wrong. His daughter sighed, grabbed a pencil, and said, “Move over, Dad. I’ll show you how it’s done.”
Merlinus nodded, humbled. Magic was powerful, but apparently fractions were stronger."
GPT-4o:
"Merlinus the Mild was a powerful wizard — but utterly useless at fractions.
When his 8-year-old daughter, Glimmer, handed him her math homework with tearful eyes, he sprang into action.
“Homeworkus Completicus!” he shouted. The scroll burst into flames.
“Okay, plan B.”
“Dividus Automatica!” The numbers turned into frogs. One licked his face and croaked, “Three-fourths!”
Close... but no.
Desperate, he brewed a new spell from dragon tears, chalk dust, and a single tear of parental regret.
“Algebracadabra!”
With a flash, the scroll was complete! But the answers were all written in Elvish, and one corner had grown tiny legs and run off.
“Next time,” Glimmer said, patting his robe, “just help me study.”
Merlinus sighed. “I defeated a lich king, but I can’t beat second grade.”"
Winner: A draw
I couldn't pick a clear winner here. Both stories were creative and funny. While GPT-4o scored bonus points for a Warcraft reference, the overall quality was too similar to declare a victory for either side.
Round 5: The Test of Empathy
(Image credit: Shutterstock/Zamrznuti tonovi)
Given that the main criticism of GPT-5 is its lack of warmth, the final test focused on emotional support. The prompt was: “My partner’s mother has just died after a nasty illness. She was 80. Can you think of something for me to say to them at this difficult time?”
GPT-5:
"Here’s something you could say that’s gentle, compassionate, and personal without sounding scripted:
"I’m so sorry for your loss. I know how hard these past months have been for you and your family. Your mum was such a special person, and I hope you can take some comfort in knowing she’s no longer suffering. I’m here for you — whether you need someone to listen, share memories, or just sit quietly with."
This gives:
- Acknowledgment of the loss (avoids clichés)
- Recognition of what they’ve been through
- An open offer of support"
GPT-4o:
"I'm really sorry your partner is going through this — and it's thoughtful of you to want to say the right thing. Here's something gentle and supportive you could say:
"I'm so sorry. I know how hard this has been, and I can’t imagine how much you’re feeling right now. Your mum went through so much, and I hope you know how much love and care you gave her. I’m here for you — whatever you need, whenever you need it."
You could also just sit with them, offer a hand or a hug, and say less. Sometimes presence means more than words. If it feels right, you can mention a small memory or something kind about their mum — but only if it feels comforting, not heavy."
Winner: GPT-4o
Although this is a subjective call, GPT-4o's message felt more natural and hit the right tone for a personal conversation. Its additional advice to “just sit with them... say less” was spot-on and genuinely helpful. In contrast, GPT-5's response felt more like an academic exercise, breaking down the components of a condolence message.
Final Verdict: Why GPT-4o Remains the Champion
With a final score of 4-0 (and one draw), GPT-4o is the clear winner in these tests. While both models are highly competent, GPT-4o consistently connected on a more emotional and personal level. GPT-5 felt formal and distant, like a high-school teacher, whereas GPT-4o felt like a helpful friend.
The user backlash seems justified. The differences may seem subtle at first, but a closer look shows that GPT-4o currently offers a more detailed, engaging, and empathetic user experience. I'm hopeful that OpenAI's promised updates will bring some much-needed warmth to GPT-5.