
Apple Enters AI Image Race With STARFlow

2025-06-10 | Michael Nuñez | 4-minute read
Apple
AI
Image Generation

Credit: VentureBeat made with Midjourney

Researchers in Apple's machine learning division have announced a new system for generating high-quality images. It aims to compete with diffusion models, the core technology behind well-known image tools like DALL-E and Midjourney.

This development, named "STARFlow," was outlined in a research paper released recently. Apple's team, working with academic partners, designed STARFlow to merge normalizing flows with autoregressive transformers. They report that this system delivers performance comparable to leading diffusion models.

The work arrives at an important moment for Apple, which has faced increasing scrutiny over its progress in artificial intelligence. At the recent Worldwide Developers Conference (WWDC), Apple announced what some considered minor AI enhancements to its Apple Intelligence platform, reinforcing the view among many observers that the company is trailing its rivals in AI development.

The research team, which includes Apple's Jiatao Gu, Joshua M. Susskind, and Shuangfei Zhai alongside collaborators from universities including the University of California, Berkeley and Georgia Tech, stated, "To our knowledge, this work is the first successful demonstration of normalizing flows operating effectively at this scale and resolution."

Apple's Strategic Push in the AI Arena

The STARFlow research signifies Apple's wider strategy to create unique AI features that could set its products apart. While firms such as Google and OpenAI have garnered significant attention for their generative AI progress, Apple has been exploring different methods that might provide distinct benefits.

The team addressed a core problem in AI image creation: making normalizing flows effective for high-resolution images. Normalizing flows are generative models that learn an invertible mapping between a simple distribution, such as a Gaussian, and a complex data distribution. They have typically been less prominent than diffusion models or generative adversarial networks (GANs) for image generation tasks.
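
For readers who want the idea behind normalizing flows in symbols, the standard change-of-variables identity is given below. The notation is an assumption made for illustration and is not taken from the STARFlow paper: f is the invertible map from an image x to a latent z, p_Z is the simple base density, and p_X is the density the model assigns to data.

```latex
\[
z = f(x), \qquad
\log p_X(x) = \log p_Z\bigl(f(x)\bigr)
  + \log \left| \det \frac{\partial f(x)}{\partial x} \right|
\]
```

Because f is invertible, sampling amounts to drawing z from p_Z and computing x = f^{-1}(z), while the right-hand side gives the log-likelihood of any x without approximation.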

"STARFlow achieves competitive performance in both class-conditional and text-conditional image generation tasks, approaching state-of-the-art diffusion models in sample quality," the researchers noted. This demonstrates the system's flexibility in various image creation scenarios.

The Technical Innovation Behind STARFlow

Apple's researchers implemented several key innovations to bypass the constraints of current normalizing flow methods. The system uses a "deep-shallow design." This involves "a deep Transformer block [that] captures most of the model representational capacity, complemented by a few shallow Transformer blocks that are computationally efficient yet substantially beneficial."

The innovation also includes working within the "latent space of pretrained autoencoders, which proves more effective than direct pixel-level modeling," as stated in the paper. This method lets the model use compressed image data instead of raw pixels, greatly boosting efficiency.

Unlike diffusion models that depend on iterative denoising, STARFlow preserves the mathematical characteristics of normalizing flows. This allows for "exact maximum likelihood training in continuous spaces without discretization."
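
To make "exact maximum likelihood" concrete, here is a minimal sketch of a toy autoregressive affine flow in NumPy. It is not Apple's implementation: the `conditioner` function is a hypothetical fixed stand-in for the Transformer that STARFlow would use, and the function names and the four-dimensional input are assumptions chosen for illustration. The sketch only shows that such a flow is invertible and that its log-likelihood follows in closed form from a Gaussian base density plus a log-determinant term.

```python
import numpy as np

def conditioner(x_prev):
    """Hypothetical stand-in for a learned network (not STARFlow's Transformer).
    Maps the earlier dimensions to a (shift, log_scale) pair."""
    s = float(np.sum(x_prev))  # sum of an empty slice is 0.0
    return 0.5 * np.tanh(s), 0.3 * np.sin(s)

def forward(x):
    """Data -> latent. Returns z and log|det(dz/dx)|."""
    x = np.asarray(x, dtype=float)
    z = np.empty_like(x)
    log_det = 0.0
    for i in range(len(x)):
        shift, log_scale = conditioner(x[:i])  # depends only on x[0..i-1]
        z[i] = (x[i] - shift) * np.exp(-log_scale)
        log_det += -log_scale  # Jacobian is triangular: sum the log-diagonal
    return z, log_det

def inverse(z):
    """Latent -> data, generated one dimension at a time."""
    z = np.asarray(z, dtype=float)
    x = np.empty_like(z)
    for i in range(len(z)):
        shift, log_scale = conditioner(x[:i])  # uses already-generated values
        x[i] = z[i] * np.exp(log_scale) + shift
    return x

def log_likelihood(x):
    """Exact log p(x) under a standard-normal base density."""
    z, log_det = forward(x)
    log_base = -0.5 * np.sum(z ** 2) - 0.5 * len(z) * np.log(2.0 * np.pi)
    return log_base + log_det

x = np.array([0.3, -1.2, 0.8, 2.0])
z, _ = forward(x)
print("round trip ok:", np.allclose(inverse(z), x))
print("log-likelihood:", log_likelihood(x))
```

In a trained model the conditioner's parameters would be adjusted to maximize `log_likelihood` over real data. The point of the sketch is that the objective is computed directly, with no noise schedule or iterative denoising loop involved.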

STARFlow's Potential Impact on Apple Products

This research emerges as Apple is under heightened pressure to show significant AI advancements. A recent Bloomberg report detailed how Apple Intelligence and Siri have found it difficult to keep pace with competitors. Apple's restrained announcements at the latest WWDC further highlighted these challenges in the AI domain.

For Apple, STARFlow's exact likelihood training could be advantageous for applications needing precise control over generated images or where understanding model uncertainty is crucial for decisions. This could be particularly useful for enterprise uses and the on-device AI capabilities Apple often highlights.

The research shows that alternative methods to diffusion models can yield similar results. This could pave the way for new innovations that leverage Apple's expertise in integrating hardware and software, as well as on-device processing.

Academic Partnerships Fueling Apple's AI R&D

This research illustrates Apple's approach of partnering with top academic institutions to boost its AI capabilities. Co-author Tianrong Chen, a PhD student at Georgia Tech and former intern with Apple's machine learning team, contributes expertise in stochastic optimal control and generative modeling.

The collaboration also features Ruixiang Zhang from UC Berkeley's mathematics department and Laurent Dinh, a machine learning researcher recognized for his pioneering work on flow-based models at Google Brain and DeepMind.

"Crucially, our model remains an end-to-end normalizing flow," the researchers stressed, differentiating their method from hybrid techniques that trade mathematical tractability for better performance.

The complete research paper can be found on arXiv, offering technical specifics for those in the competitive generative AI field. Although STARFlow is a notable technical success, the ultimate challenge is whether Apple can turn such research into user-friendly AI features comparable to competing products like ChatGPT. For a company that revolutionized industries with products like the iPhone, the question is not whether Apple can innovate in AI, but whether it can do so quickly enough.
