
Alibaba Qwen AI Now Edits Images Like A Pro

2025-08-20 | Jonathan Kemper | 4 minutes read
AI
Image Editing
Alibaba

Alibaba upgrades its Qwen image model with visual and semantic image editing

Alibaba has just rolled out a significant update to its Qwen image model, introducing a powerful suite of editing tools capable of both minor visual tweaks and major semantic transformations.

A Leap Forward in AI Image Editing

The new model, named Qwen-Image-Edit, is built upon the robust foundation of Alibaba's 20-billion-parameter Qwen-Image model. It employs a sophisticated dual-processing strategy. For semantic understanding and control, it leverages Qwen2.5-VL, while a Variational Autoencoder (VAE) is tasked with managing the visual appearance and fidelity of the image. This combination allows the system to handle a wide spectrum of edits, from simple touch-ups to complex conceptual changes.

Two Modes for Creative Control

Qwen-Image-Edit offers two distinct workflows to suit different creative needs:

  • Appearance Editing: This mode allows users to make precise changes to specific areas of an image while ensuring the rest of the composition remains completely untouched. It's ideal for tasks like removing stray hairs, editing clothing, or changing background elements.
  • Semantic Editing: This powerful mode modifies pixels across the entire image to implement a new concept, such as changing the style or rotating an object, while maintaining the core identity and consistency of the main subject.
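One way to picture the difference between the two modes is to check which pixels change outside the region a user meant to edit. The sketch below is only an illustration of that idea on a toy image made of nested lists; the helper `changed_outside_box` and the box convention are hypothetical and are not part of Qwen-Image-Edit.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Grid = List[List[Pixel]]           # toy image: rows of RGB pixels
Box = Tuple[int, int, int, int]    # (left, top, right, bottom), right/bottom exclusive


def changed_outside_box(before: Grid, after: Grid, box: Box) -> int:
    """Count pixels that differ between two images outside the edit box."""
    left, top, right, bottom = box
    count = 0
    for y, (row_before, row_after) in enumerate(zip(before, after)):
        for x, (p, q) in enumerate(zip(row_before, row_after)):
            inside = left <= x < right and top <= y < bottom
            if not inside and p != q:
                count += 1
    return count


# A 4x4 white image, a local edit (one pixel inside the box),
# and a global edit (every pixel restyled to gray).
before = [[(255, 255, 255)] * 4 for _ in range(4)]
local_edit = [row[:] for row in before]
local_edit[1][1] = (0, 0, 0)
global_edit = [[(128, 128, 128)] * 4 for _ in range(4)]
box = (1, 1, 3, 3)

print(changed_outside_box(before, local_edit, box))   # 0
print(changed_outside_box(before, global_edit, box))  # 12
```

A count of zero matches the appearance-editing behavior described above, while a large count is what a semantic edit such as a style change would be expected to produce.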

From Mascot Creation to Style Transfer

To showcase its semantic editing prowess, Alibaba demonstrated how the model can generate new intellectual property (IP) content featuring its Capybara mascot. Even with significant pixel changes across the image, the character remains instantly recognizable in various new roles and styles.

Qwen-Image-Edit generates new versions of the Capybara mascot that can be used as stickers in messenger apps and other formats. | Image: Alibaba

Other creative applications include generating new perspectives with 90- or 180-degree object rotations and performing style transfers, such as transforming a standard portrait into a Studio Ghibli-inspired avatar.

The model generates new viewpoints for people, animals, and objects. | Image: Alibaba

Intelligent Object and Background Manipulation

The model's capabilities extend to complex interactions within an image. It can seamlessly add new objects, like a wooden sign in front of a penguin colony, and realistically render corresponding shadows and reflections. This demonstrates a sophisticated understanding of light and environmental context.

Qwen-Image-Edit places a wooden sign reading "Welcome to Penguin Beach" in front of a penguin colony and generates natural shadows. | Image: Alibaba

Advanced Bilingual Text Editing

One of the standout features of Qwen-Image-Edit is its ability to edit text in both Chinese and English directly within images. The system can add, remove, or modify text while preserving the original font, size, and style, as seen in an example where Scrabble tiles are changed from "Health Insurance" to "Financial Planning."

Qwen-Image-Edit updates Scrabble tiles from "Health Insurance" to "Financial Planning," maintaining the original look. | Image: Alibaba

For corrections, users can draw bounding boxes around incorrect or unwanted text, and the model updates the selected areas. Although it occasionally struggles with rare characters, the system supports step-by-step refinement: users can mark specific spots for further edits and repeat the process until the result is correct.

The tool replaces incorrect characters and lets users directly mark the areas that need changes. | Image: Alibaba
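The box-by-box correction workflow can be pictured with a toy example in which the "image" is a grid of characters and each refinement pass rewrites only the cells inside one marked box. This is a sketch of the interaction pattern only; the `Box` type and `apply_correction` helper are hypothetical, and the real model regenerates pixels rather than swapping characters.

```python
from dataclasses import dataclass
from typing import List

Grid = List[List[str]]  # toy "image": one character per cell


@dataclass(frozen=True)
class Box:
    left: int
    top: int
    right: int   # exclusive
    bottom: int  # exclusive


def apply_correction(image: Grid, box: Box, text: str) -> Grid:
    """Return a copy of the image with only the cells inside the box rewritten."""
    out = [row[:] for row in image]
    chars = iter(text)
    for y in range(box.top, box.bottom):
        for x in range(box.left, box.right):
            # Cells beyond the end of the replacement text keep their old value.
            out[y][x] = next(chars, out[y][x])
    return out


# A sign with two wrong characters, fixed in two marked passes.
sign: Grid = [list("WELCONE TO PENGUEN BEACH")]
passes = [
    (Box(5, 0, 6, 1), "M"),    # first pass: fix "WELCONE"
    (Box(16, 0, 17, 1), "I"),  # second pass: fix "PENGUEN"
]

result = sign
for box, text in passes:
    result = apply_correction(result, box, text)

print("".join(result[0]))  # WELCOME TO PENGUIN BEACH
```

Each pass leaves everything outside its box unchanged, which is why the corrections can be applied one after another without undoing earlier fixes.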

Availability and Industry Context

Alibaba claims that Qwen-Image-Edit achieves state-of-the-art performance on public image editing benchmarks. The model is now accessible through the "Image Editing" feature in Qwen Chat and is also available to developers on GitHub, Hugging Face, and ModelScope.
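For developers who want to try the Hugging Face release, a minimal sketch might look like the following. It rests on assumptions that this article does not confirm: that a recent `diffusers` release exposes a `QwenImageEditPipeline` class, that the weights live under the repository id `Qwen/Qwen-Image-Edit`, and that the parameter values shown are reasonable defaults. The helper names `build_edit_inputs` and `run_edit` are hypothetical. Check the model card before relying on any of it.

```python
from typing import Any, Dict

MODEL_ID = "Qwen/Qwen-Image-Edit"  # assumed Hugging Face repository id


def build_edit_inputs(image: Any, prompt: str, steps: int = 50) -> Dict[str, Any]:
    """Collect the call arguments for one edit (values are illustrative)."""
    return {
        "image": image,
        "prompt": prompt,
        "negative_prompt": " ",
        "true_cfg_scale": 4.0,
        "num_inference_steps": steps,
    }


def run_edit(input_path: str, prompt: str, output_path: str, seed: int = 0) -> None:
    """Load the pipeline and apply one text-instructed edit (needs a large GPU)."""
    # Heavy imports stay inside the function so this file imports without them.
    import torch
    from PIL import Image
    from diffusers import QwenImageEditPipeline  # assumed to exist in recent diffusers

    pipeline = QwenImageEditPipeline.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)
    pipeline.to("cuda")

    image = Image.open(input_path).convert("RGB")
    inputs = build_edit_inputs(image, prompt)
    inputs["generator"] = torch.manual_seed(seed)  # fixed seed for repeatable output

    with torch.inference_mode():
        result = pipeline(**inputs)
    result.images[0].save(output_path)
```

A call such as `run_edit("tiles.png", "Change the text to 'Financial Planning'", "tiles_edited.png")` would then apply a single instruction-based edit; the file names here are placeholders.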

This release marks a significant advancement in the field of targeted image editing, an area where AI models have historically struggled. It demonstrates how quickly the technology is moving beyond simple generation to provide nuanced and precise creative control.
