Apple Taps Google AI To Build New Image Editing Dataset

In a surprising move that highlights cross-industry collaboration in AI development, Apple has released Pico-Banana-400K, a massive research dataset containing 400,000 images. What makes this release particularly noteworthy is that Apple leveraged Google’s powerful Gemini-2.5 models to build this resource for the AI research community.
Apple's research team has published their work in a study titled “Pico-Banana-400K: A Large-Scale Dataset for Text-Guided Image Editing.” Alongside the paper, the full 400,000-image dataset has been made available under a non-commercial research license. This allows academics, students, and AI researchers everywhere to use and explore this high-quality data for non-profit projects, fostering innovation in the field.
Solving a Key Problem in AI Research
While AI image editing models have seen rapid advancements, with Google's Gemini-2.5-Flash-Image (also known as Nano-Banana) leading the pack, the research community has faced a significant hurdle. As Apple's researchers point out, progress has been hampered by a lack of suitable training data.
“Despite these advances, open research remains limited by the lack of large-scale, high-quality, and fully shareable editing datasets. Existing datasets often rely on synthetic generations from proprietary models or limited human-curated subsets... hindering the development of robust editing models.”
To address this gap, Apple took on the challenge of creating a comprehensive and publicly available dataset to serve as a new benchmark for the industry.
How Apple Built the Dataset with Google's AI
The creation of Pico-Banana-400K involved a sophisticated, multi-step process. First, Apple researchers selected a diverse range of real photographs from the public OpenImages dataset, ensuring a wide representation of people, objects, and scenes.
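Apple's paper doesn't specify the tooling behind this sampling step, but as a rough illustration, the open-source FiftyOne library can pull a labeled slice of OpenImages in a few lines. The split, sample count, and library choice below are assumptions for the sketch, not details from the paper:

```python
# Illustrative only: Apple has not said which tooling it used to sample
# OpenImages; FiftyOne is simply a convenient open-source way to do it.
import fiftyone.zoo as foz

# Download a small slice of OpenImages (v7) to work with locally.
dataset = foz.load_zoo_dataset(
    "open-images-v7",
    split="validation",
    max_samples=500,   # keep the sample small for a sketch
    shuffle=True,      # shuffle for a more diverse draw
)

# Each sample carries a filepath to the downloaded photo.
for sample in dataset.take(5):
    print(sample.filepath)
```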
*Image caption: Yes, they actually used Comic Sans.*
They then defined 35 different types of image edits across eight categories. These ranged from simple adjustments to complex transformations, as the examples below and the short sketch after them illustrate:
- Pixel & Photometric: Add film grain or vintage filter
- Human-Centric: Create a Funko-Pop–style toy figure of a person
- Scene Composition: Change weather conditions (e.g., sunny to rainy)
- Object-Level Semantic: Relocate an object within the image
- Scale: Zoom in on a specific area
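To make the taxonomy concrete, here is a toy sketch of how it might be represented in code. It covers only the five categories and examples named above; the paper's remaining three categories are omitted rather than guessed at, and each edit's phrasing is paraphrased from the list:

```python
import random

# A toy category -> example-edits mapping. Only the five categories
# named in the article are included; the paper's other three are omitted.
EDIT_TAXONOMY: dict[str, list[str]] = {
    "Pixel & Photometric": ["add film grain", "apply a vintage filter"],
    "Human-Centric": ["create a Funko-Pop-style toy figure of the person"],
    "Scene Composition": ["change the weather from sunny to rainy"],
    "Object-Level Semantic": ["relocate an object within the image"],
    "Scale": ["zoom in on a specific area"],
}

def sample_edit_instruction() -> str:
    """Draw a random (category, edit) pair and phrase it as an instruction."""
    category = random.choice(list(EDIT_TAXONOMY))
    edit = random.choice(EDIT_TAXONOMY[category])
    return f"[{category}] {edit}"

print(sample_edit_instruction())
```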
Prompts built from these edit types were fed to Google's Nano-Banana model, which generated the edited images. To ensure quality, Apple then used another Google model, Gemini-2.5-Pro, to automatically analyze each result and approve or reject it based on its quality and how faithfully it followed the original instruction.
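A hedged sketch of that generate-then-judge loop is below, written against Google's public google-genai Python SDK. The Gemini model IDs are the publicly documented ones, but the prompt wording, the APPROVE/REJECT rubric, and the helper functions are illustrative assumptions, not Apple's actual pipeline:

```python
# A sketch of the generate-then-judge loop using Google's public
# google-genai SDK (pip install google-genai). The prompt text and the
# APPROVE/REJECT rubric are illustrative assumptions, not Apple's setup.
from google import genai
from google.genai import types
from PIL import Image

client = genai.Client()  # reads the API key from the environment

def edit_image(source: Image.Image, instruction: str) -> types.Part | None:
    """Ask the image model (Nano-Banana) to apply a text-guided edit."""
    response = client.models.generate_content(
        model="gemini-2.5-flash-image",  # public model ID for Nano-Banana
        contents=[instruction, source],
    )
    # Return the first image part of the response, if any.
    for part in response.candidates[0].content.parts:
        if part.inline_data is not None:
            return part
    return None

def judge_edit(source: Image.Image, edited: types.Part, instruction: str) -> bool:
    """Have Gemini-2.5-Pro approve or reject the edit (illustrative rubric)."""
    verdict = client.models.generate_content(
        model="gemini-2.5-pro",
        contents=[
            "Grade this image edit. Reply APPROVE or REJECT based on "
            f"visual quality and faithfulness to the instruction: {instruction!r}",
            source,
            edited,
        ],
    )
    return "APPROVE" in (verdict.text or "").upper()
```

Notably, rejected results need not be wasted: as described below, the dataset pairs failed edits with successful ones as preference data.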

The final dataset is incredibly rich, including not only single-prompt edits but also multi-turn sequences that show iterative changes. It also contains 'preference pairs' that match successful edits with failed ones, helping models learn what separates a good result from a bad one.
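For a feel of what such records might look like, here is a purely hypothetical sketch of the three record shapes described above. The field names and file paths are invented for illustration; the real schema is defined in the paper and repository:

```python
# Hypothetical record layouts for illustration only; the actual field
# names and structure come from Apple's paper and repository, not here.
single_turn_example = {
    "source_image": "openimages/abc123.jpg",  # invented path
    "instruction": "change the weather from sunny to rainy",
    "edited_image": "edits/abc123_rainy.png",
}

multi_turn_example = {
    "source_image": "openimages/def456.jpg",
    "turns": [  # iterative edits applied one after another
        {"instruction": "add film grain", "edited_image": "edits/def456_t1.png"},
        {"instruction": "zoom in on the subject", "edited_image": "edits/def456_t2.png"},
    ],
}

preference_pair_example = {
    "source_image": "openimages/ghi789.jpg",
    "instruction": "relocate the dog to the left side of the frame",
    "chosen": "edits/ghi789_pass.png",    # edit the judge approved
    "rejected": "edits/ghi789_fail.png",  # edit the judge rejected
}
```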

Advancing the Future of Image Editing
The Apple research team acknowledges that even state-of-the-art models have limitations, particularly with fine-grained spatial edits and typography. However, they express hope that Pico-Banana-400K will provide “a robust foundation for training and benchmarking the next generation of text-guided image editing models.”
For those interested in diving deeper, the full research paper is available on arXiv, and the complete dataset can be accessed on GitHub.

