ChatGPT Image Generation: Creative but Flawed
Overall Score: 6.0 / 10
ChatGPT's venture into AI image generation has been met with anticipation, particularly for its text-to-image capabilities. Here's a quick look at its strengths and weaknesses as reviewed by CNET:
Pros
- Available with both free and paid ChatGPT plans
- Offers decent adherence to initial prompts
- Capable of producing a variety of versatile styles
Cons
- Suffers from long image generation times
- Images with text are often riddled with typos
- Lacks essential advanced editing tools
ChatGPT is a name practically synonymous with generative AI, so OpenAI's move into the AI image business wasn't entirely surprising. With GPT-4o's new text-to-image features, the internet saw a surge of Studio Ghibli-esque images created by ChatGPT. As CNET's AI image and video reviewer, I was excited by the promise that ChatGPT could generate images with substantial text, a common failing point for other AI services. However, extensive testing revealed that while ChatGPT excels as an AI chatbot, it's still an amateur in image generation.
I spent many hours with ChatGPT, generating nearly 100 images using OpenAI's models like 4o, o3, and o4-mini. There are similarities to OpenAI's other image tool, DALL-E, such as the conversational editing flow, which contributed to DALL-E 3 winning a CNET Editors' Choice award. But the differences aren't all positive.
ChatGPT is slower than any other program I've tested, including DALL-E. Unlike Canva, Midjourney, and Leonardo.Ai, ChatGPT only produces one image per prompt, limiting variety. Most critically, it lacks advanced editing capabilities, which becomes a major hurdle for text-heavy images. The closest competitor is Meta AI, also primarily a chatbot. Between them, ChatGPT offers more creative and versatile results.
If you're familiar with ChatGPT, the interface will feel natural. However, users experienced with graphic design software or other AI image generators will likely be disappointed. OpenAI's image generation is best for those seeking a quick image tool or individuals less creatively inclined who want to visualize ideas without complex processes. It's not suited for professional creators aiming to enhance their digital toolkit.
Here's a breakdown of ChatGPT's performance based on my tests, covering image quality, prompt matching, fine-tuning capabilities, and generation speed.
(Disclosure: Ziff Davis, CNET's parent company, filed a lawsuit against OpenAI in April, alleging copyright infringement in the training and operation of its AI systems.)
How AI Image Generators Are Tested
CNET adopts a practical approach to reviewing AI image generators. The goal is to assess their quality relative to competitors and identify their best use cases. This involves using AI prompts based on real-world scenarios, such as rendering in specific styles, combining elements, and handling detailed descriptions. Generators are scored on a 10-point scale considering prompt adherence, result creativity, and response speed. See how CNET tests AI for more details.
You can create images with ChatGPT on any plan, with limited access on the free tier. Upgrading to Plus ($20/month) or Pro ($200/month) plans provides higher rate limits and other ChatGPT features. Even with paid plans, server overload can cause wait times.
OpenAI's general privacy policy, which covers image generation, states it can use your personal information and content, including your prompts and uploaded files, to improve its services. To opt out, navigate to Settings > Data controls and disable "Improve the model for everyone." Abusive or illegal uses are prohibited according to GPT-4o's system card and OpenAI's safety policies.
Image Quality and Prompt Accuracy
ChatGPT is quite versatile in style, handling both photorealistic stock imagery and whimsical, cartoonish looks. The images are generally fun and engaging. ChatGPT images do not have visible watermarks indicating AI generation, so disclosure is important. However, they do contain C2PA metadata identifying them as AI-generated.
A key differentiator is ChatGPT's ability to generate text-heavy images. While most AI image generators struggle with text, ChatGPT is the most impressive I've seen for creating legible English text.
Created by Katelyn Chedraoui using ChatGPT
However, creating meaningful text constructions is where ChatGPT falters. I attempted to generate educational posters, which require descriptions but also allow for creativity. Some were acceptable, others were disastrous.
This took me more than 20 prompts and 2 hours to create, and there's still a typo. Created by Katelyn Chedraoui using ChatGPT
ChatGPT performs best with text-heavy images when provided a website or text passage for reference, but even then, it's not always accurate.
I'm a big Wishbone Kitchen fan, but ChatGPT seriously misread Meredith Hayden's French hot cocoa recipe. Created by Katelyn Chedraoui using ChatGPT
And this attempt was particularly perplexing:
This was the best iteration of the bunch, if you ignore Mr. Darcy's eyes. Created by Katelyn Chedraoui using ChatGPT
Despite OpenAI's policies against reproducing copyrighted imagery, I sometimes succeeded in doing so. ChatGPT created a fake product photo of a Hydro Flask water bottle with the real logo, even after the chatbot stated it shouldn't. Attempts to recreate the Teenage Mutant Ninja Turtles by name were blocked, but describing them as "human-like turtles with colorful masks" yielded very similar results, likely due to OpenAI's broad training data. (Note: OpenAI faces lawsuits from companies and artists alleging copyright infringement.)
None of the other image generators I've tested replicated brand logos the way ChatGPT did. It's advisable to avoid creating AI images that could infringe on protected content.
Realistic-looking protected content I generated with ChatGPT. Created by Katelyn Chedraoui using ChatGPT
I primarily tested ChatGPT on a laptop via the website but also created a few images using the iPhone app. The experiences were similar, and mobile creation is a nice option.
A cartoonish version of my beach photo created using the ChatGPT iPhone app. Created by Katelyn Chedraoui using ChatGPT
ChatGPT demonstrates good prompt adherence for initial requests. Practicing good prompt writing by including details about aesthetic, style, characters, and setting helps. ChatGPT struggles more with post-generation edits, so crafting a detailed initial prompt is recommended.
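As a rough sketch of that advice, the snippet below assembles a single prompt from the four kinds of detail mentioned above. The function name `build_image_prompt` and its parameters are hypothetical, chosen only for illustration. Nothing here calls ChatGPT or any OpenAI API; it just builds a string you could paste into the chat box.

```python
def build_image_prompt(subject, aesthetic="", style="", characters="", setting=""):
    """Join the subject and any non-empty details into one prompt string."""
    details = [
        ("Aesthetic", aesthetic),
        ("Style", style),
        ("Characters", characters),
        ("Setting", setting),
    ]
    # Start with the subject as its own sentence.
    parts = [subject.strip().rstrip(".") + "."]
    for label, value in details:
        value = value.strip()
        if value:  # skip details the user left blank
            parts.append(f"{label}: {value.rstrip('.')}.")
    return " ".join(parts)


# Example values are made up for illustration.
prompt = build_image_prompt(
    subject="A birthday invitation for a backyard party",
    aesthetic="bright, colorful and bold",
    style="hand-drawn cartoon",
    characters="a golden retriever wearing a party hat",
    setting="a sunny garden with balloons",
)
print(prompt)
```

Spelling out every field up front fits the review's finding that ChatGPT handles initial prompts better than follow-up edits.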
Engagement Factor of Generated Images
The images created with ChatGPT were quite engaging, colorful, and bold. The model showed creativity, producing interesting images from varied prompts. While not as intricate or detailed as those from Midjourney or Leonardo.Ai, they were decent. Requesting fine details in your prompt can improve results.
I gave ChatGPT specific instructions in this prompt to be as detailed as possible. It's pretty good, but normal images without that instruction are more plain. Created by Katelyn Chedraoui using ChatGPT
Fine Tuning and Editing Capabilities
You can edit images after generation by sending follow-up prompts. Highlighting a specific area for adjustment (by clicking the image and using the paintbrush icon) helps achieve more accurate changes, so I recommend it for all edits.
For text-heavy images, fixing typos is possible but requires patience. Highlighting the typo region, rerunning the prompt, or directly telling ChatGPT about the error sometimes works. However, fixing one error often introduces new ones, which makes the process frustrating and usually unsuccessful.
I also encountered issues where ChatGPT would cut off the left side of images, requiring manual requests to extend it. While an easy fix, such basic aspect ratio problems are unexpected from a company like OpenAI. Regenerating to fix this often led to unintended changes elsewhere.
These editing frustrations highlight the need for advanced editing tools similar to those in OpenAI's Sora video generator. The conversational flow is good for brainstorming but insufficient for detailed edits. This minimalist approach to editing features is a significant disappointment.
This birthday invitation is a good example of ChatGPT's abilities. It took some editing, as you can see through the progression. But the final result is super cute and usable. Created by Katelyn Chedraoui using ChatGPT
Generation Speed and Performance
ChatGPT has the longest generation time of any AI image service I've tested, taking 1 to 2 minutes per image, even with its latest o4-mini model. Most competitors produce a batch of four images in 20 to 40 seconds. ChatGPT takes more than twice as long for a single image.
The slow loading speed is exasperating for users accustomed to near-instant responses. The energy required for text-heavy image generation likely contributes to this delay.
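To put the timing figures above on a per-image basis, here is a small back-of-the-envelope calculation. It uses only the ranges quoted in this review (1 to 2 minutes for one ChatGPT image, 20 to 40 seconds for a competitor's batch of four). Pairing the fastest times with each other, and the slowest with each other, is an assumption made for illustration; these are not paired measurements.

```python
def seconds_per_image(batch_seconds, images_per_batch):
    """Average wall-clock seconds spent per image in one generation request."""
    return batch_seconds / images_per_batch


# Ranges quoted in the review: (fastest, slowest) request time in seconds.
chatgpt = (60, 120)    # one image per prompt
competitor = (20, 40)  # a batch of four images per prompt

chatgpt_per_image = [seconds_per_image(t, 1) for t in chatgpt]
competitor_per_image = [seconds_per_image(t, 4) for t in competitor]

for label, c, k in zip(("fastest", "slowest"), chatgpt_per_image, competitor_per_image):
    print(f"{label}: ChatGPT {c:.0f}s/image vs competitor {k:.0f}s/image ({c / k:.0f}x)")
# prints:
# fastest: ChatGPT 60s/image vs competitor 5s/image (12x)
# slowest: ChatGPT 120s/image vs competitor 10s/image (12x)
```

On these figures, the gap per request is roughly threefold when like is paired with like, but the gap per image is closer to twelvefold, because competitors return four images for each wait.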
Final Verdict: A Chatbot First, Image Tool Second
After extensive testing, I was left disappointed. From a tech giant like OpenAI, I expected its image generation feature to match or surpass competitors. However, ChatGPT remains primarily a chatbot, and this was evident throughout testing. Its core function is text generation, but just as its text outputs can contain inaccuracies, its AI images with text were often wonky and typo-riddled.
The lack of advanced editing tools is a major drawback. Any method to easily remediate typos and errors would have significantly improved usability. Instead, fixing issues via prompting was frustrating, often creating new errors. The conversational flow is nice but no substitute for dedicated post-generation editing tools. The slow pace and single-image output would be more tolerable if the images were consistently good and easily editable.
It's hard to believe this is from the same company behind Sora, OpenAI's impressive video generator. I hope OpenAI integrates better tools into ChatGPT in the future. Otherwise, I'll stick to using ChatGPT for occasional search queries.
For more, check out CNET's guide to writing AI image prompts and list of the best AI chatbots.