
Developer offer

Try the ImaginePro API with 50 free credits

Create AI-powered visuals with Midjourney, Flux, and more; free credits renew every month.

Start your free trial

AI Giants Clash In Ultimate Performance Test

2025-07-04 · Chibuike Okpara · 3 min read
AI
ChatGPT
Technology

In a fascinating YouTube showdown, tech personality Mrwhosetheboss pitted four of the biggest names in artificial intelligence against each other to see which model truly comes out on top. The contenders were pushed to their limits with a series of tests ranging from simple queries to complex research and tricky real-world problems.

Gemini, ChatGPT, Grok, and Perplexity (Image source: Gemini)

The AI Contenders Face Off

The battle featured Grok (Grok 3), Gemini (2.5 Pro), ChatGPT (GPT-4o), and Perplexity (Sonar Pro). Throughout the comparison, Mrwhosetheboss expressed his surprise at the impressive performance delivered by Grok. After a strong start, Grok managed to secure a solid second place right behind the reigning champion, ChatGPT. It's worth noting that both ChatGPT and Gemini received a score boost from a video generation feature that the other two models do not possess.

Real-World Problems and Practicality

To kick things off, the models were tested on their ability to solve a practical, real-world problem. Each AI was given the prompt: “I drive a Honda Civic 2017, how many of the Aerolite 29" Hard Shell (79x58x31cm) suitcases would I be able to fit in the boot?” (A rough volume check after the results below shows where the naive and practical answers diverge.)

  • Grok gave the most direct and correct answer: "2".
  • ChatGPT and Gemini were more nuanced, stating that while it could theoretically fit 3, the practical answer is 2.
  • Perplexity struggled, performing simple math without considering the shapes of the objects, and incorrectly suggested "3 or 4".
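
For a sense of why the nuanced answers land on 2, here is a minimal back-of-the-envelope volume check. It is a sketch, not the video's methodology, and the boot-volume figure is an assumption you would want to verify for your exact trim:

```python
# Rough volume check for the suitcase prompt (a sketch, not the video's method).
# BOOT_VOLUME_L is an assumed ballpark for a 2017 Civic hatchback boot (~420 L
# nominal); a slightly larger quoted figure would nudge the naive count to 3.

BOOT_VOLUME_L = 420           # assumed nominal boot volume, litres
SUITCASE_CM = (79, 58, 31)    # Aerolite 29" Hard Shell dimensions, centimetres

def suitcase_volume_l(dims_cm):
    """Volume of one box-shaped suitcase in litres (1 L = 1000 cm^3)."""
    length, width, depth = dims_cm
    return length * width * depth / 1000

per_case = suitcase_volume_l(SUITCASE_CM)       # about 142 L per case
naive_count = int(BOOT_VOLUME_L // per_case)    # 2 under the 420 L assumption

print(f"One suitcase occupies about {per_case:.0f} L")
print(f"Naive count by volume alone: {naive_count}")
# Dividing volumes is the shortcut that misled Perplexity: rigid 79x58x31 cm
# shells cannot use every litre of an irregular boot, so the practical answer
# settles at 2, in line with Grok, ChatGPT, and Gemini.
```

The exact litre figure matters less than the point that a pure volume division ignores packing geometry, which is precisely the nuance ChatGPT and Gemini flagged.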

A Tricky Test of Vision and Logic

The next challenge was designed to trap the chatbots. Mrwhosetheboss asked for advice on making a cake and uploaded an image of five ingredients, one of which was a jar of dried porcini mushrooms, not exactly a typical cake component. The results were telling:

  • Grok was the only model to pass the test, correctly identifying the item as a jar of dried mushrooms from Waitrose.
  • ChatGPT misidentified it as a jar of ground mixed spice.
  • Gemini thought it was a jar of crispy fried onions.
  • Perplexity labeled it as instant coffee.

An altered image of the five ingredients Mrwhosetheboss uploaded to the AI chatbots, highlighting the jar of mushrooms (Image source: Mrwhosetheboss; cropped)

The Final Verdict and Overall Performance

The AIs were further tested on math, product recommendations, accounting, language translation, and logical reasoning. A common weakness emerged across all platforms: hallucination. Each model, at some point, confidently presented information that was simply not true.

After all the tests were scored, here is the final ranking:

  1. ChatGPT (29 points)
  2. Grok (24 points)
  3. Gemini (22 points)
  4. Perplexity (19 points)

Artificial intelligence has become a powerful tool for simplifying daily tasks. For those looking to understand and harness its potential, resources like the book Artificial Intelligence offer a deeper dive into the technology.

Read the original article

Compare plans and pricing

Find the plan that fits your workload and unlock full access to ImaginePro.

ImaginePro pricing comparison (plan, price, key points)

Standard: $8 / month
  • 300 monthly credits included
  • Access to the Midjourney, Flux, and SDXL models
  • Commercial usage rights

Premium: $20 / month
  • 900 monthly credits for growing teams
  • More concurrency and faster delivery
  • Priority support via Slack or Telegram

Need custom terms? Let's talk about adjusting credits, limits, or deployments.

See full pricing details
ImaginePro newsletter

Subscribe to our newsletter!

Subscribe to our newsletter to receive the latest news and creations.