Mastering Image Generation with Google Nano Banana

2025-09-04•Unknown•7 minutes read

The Rise of AI Image Generation

Generative AI for image creation has transformed how individuals and businesses produce visual content. These tools empower users to create specific visuals in seconds, bypassing the need for extensive design skills and accelerating tasks that traditionally took hours or days.

The market is filled with advanced image generation models like Stable Diffusion, Midjourney, DALL-E, and Google's own Imagen, each offering distinct advantages. Recently, Google made a significant leap forward with the release of Gemini 2.5 Flash Image, also known as nano-banana.

A self-portrait of a nano-banana character generated by AI. Image by Author | Gemini (nano-banana self portrait)

Introducing Google's Nano-Banana

Nano-banana is Google's state-of-the-art model for both generating and editing images. Its key features include creating highly realistic images, blending multiple images, maintaining consistent characters across different scenes, and applying targeted, prompt-based transformations. It offers a level of control that surpasses many previous models from Google and its competitors.

This guide will walk you through nano-banana's capabilities, demonstrating how to generate and edit images using both the Google AI Studio platform and the Gemini API in a Python environment.

Getting Started with Nano-Banana

To begin, you'll need a Google account to sign in to Google AI Studio. To use the Gemini API, you must also acquire an API key, which requires a paid plan.

For those who want to use the API with Python, install the SDK along with Pillow, which the examples in this guide use to open and display images:

```bash
pip install google-genai pillow
```
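Rather than pasting the API key directly into a script, it can be read from the environment. The following is a minimal sketch, not part of the original guide: the helper name `load_api_key` is hypothetical, and the variable names `GEMINI_API_KEY` and `GOOGLE_API_KEY` are assumptions about how the key was exported on your machine.

```python
import os


def load_api_key(env=None, names=("GEMINI_API_KEY", "GOOGLE_API_KEY")):
    """Return the first non-empty API key found under the given variable names.

    `env` defaults to os.environ; passing a plain dict makes the helper easy to test.
    The variable names are assumptions, adjust them to match your own setup.
    """
    env = os.environ if env is None else env
    for name in names:
        value = env.get(name, "").strip()
        if value:
            return value
    raise RuntimeError(
        "No API key found; set one of: " + ", ".join(names)
    )
```

When the client is created later in this guide, the hard-coded string could then be swapped for `genai.Client(api_key=load_api_key())`.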

Once your account is ready, navigate to Google AI Studio and choose the gemini-2.5-flash-image-preview model, which is the official model ID for nano-banana.

Selecting the Gemini 2.5 Flash Image Preview model in Google AI Studio.

Generating Your First Image

After selecting the model, you can start a new chat to generate an image. A key principle for achieving the best results is to describe the scene narratively, rather than just listing keywords. This descriptive approach helps the model better understand your vision.
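One way to keep prompts narrative is to assemble them from full-sentence scene details instead of comma-separated keywords. The sketch below is illustrative only: `build_narrative_prompt` and its parameters are hypothetical names, not part of any Google tool, and the sentence templates are just one possible phrasing.

```python
def build_narrative_prompt(subject, setting, lighting=None, camera=None, mood=None):
    """Join scene details into complete sentences rather than a keyword list."""
    parts = [subject, setting]
    if lighting:
        parts.append(lighting)
    if camera:
        parts.append(f"Captured on {camera}")
    if mood:
        parts.append(f"The overall mood is {mood}")

    sentences = []
    for part in parts:
        text = part.strip()
        if not text:
            continue
        # Make sure every detail reads as its own finished sentence.
        if text[-1] not in ".!?":
            text += "."
        sentences.append(text)
    return " ".join(sentences)


prompt = build_narrative_prompt(
    subject="A photorealistic close-up portrait of an Indonesian batik artisan",
    setting="She works at a wooden table in a breezy veranda",
    lighting="Late-morning window light rakes across the fabric",
    camera="an 85 mm at f/2",
    mood="focused, tactile, and proud",
)
```

The resulting string can be pasted into the AI Studio text box or passed to the API as the prompt.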

In the AI Studio chat interface, you can enter your prompt in the text box.

The chat interface in Google AI Studio for entering image generation prompts.

Let's use a detailed prompt to generate a photorealistic image:

A photorealistic close-up portrait of an Indonesian batik artisan, hands stained with wax, tracing a flowing motif on indigo cloth with a canting pen. She works at a wooden table in a breezy veranda; folded textiles and dye vats blur behind her. Late-morning window light rakes across the fabric, revealing fine wax lines and the grain of the teak. Captured on an 85 mm at f/2 for gentle separation and creamy bokeh. The overall mood is focused, tactile, and proud.

Here is the generated image:

A photorealistic image of an Indonesian batik artisan working on a piece of cloth.

The resulting image is highly realistic and accurately reflects the detailed prompt. To achieve this same result using Python, you can use the following code snippet:

```python
from io import BytesIO

from google import genai
from IPython.display import display
from PIL import Image

# Replace 'YOUR-API-KEY' with your actual API key
api_key = 'YOUR-API-KEY'
client = genai.Client(api_key=api_key)

prompt = "A photorealistic close-up portrait of an Indonesian batik artisan, hands stained with wax, tracing a flowing motif on indigo cloth with a canting pen. She works at a wooden table in a breezy veranda; folded textiles and dye vats blur behind her. Late-morning window light rakes across the fabric, revealing fine wax lines and the grain of the teak. Captured on an 85 mm at f/2 for gentle separation and creamy bokeh. The overall mood is focused, tactile, and proud."

response = client.models.generate_content(
    model="gemini-2.5-flash-image-preview",
    contents=prompt,
)

# Collect the raw bytes of every image part in the first candidate
image_parts = [
    part.inline_data.data
    for part in response.candidates[0].content.parts
    if part.inline_data
]

if image_parts:
    image = Image.open(BytesIO(image_parts[0]))
    # image.save('your_image.png')
    display(image)
```
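A response can contain text parts alongside image parts, so it may help to separate the two and write every image to disk. The following is a sketch under stated assumptions: `split_parts` and `save_images` are hypothetical helpers, they rely only on the `text`, `inline_data.data`, and `inline_data.mime_type` attributes of each part, and the `.png` fallback extension is a guess for cases where the MIME type is unrecognized.

```python
import mimetypes
from pathlib import Path


def split_parts(response):
    """Separate the first candidate's parts into text strings and (mime_type, bytes) images."""
    texts, images = [], []
    parts = response.candidates[0].content.parts or []
    for part in parts:
        blob = getattr(part, "inline_data", None)
        if blob is not None and blob.data:
            images.append((blob.mime_type or "image/png", blob.data))
        elif getattr(part, "text", None):
            texts.append(part.text)
    return texts, images


def save_images(images, stem, out_dir="."):
    """Write each image to out_dir as '<stem>_<index><ext>' and return the paths."""
    paths = []
    for index, (mime_type, data) in enumerate(images):
        # Fall back to .png when the MIME type is not recognized (an assumption).
        ext = mimetypes.guess_extension(mime_type) or ".png"
        path = Path(out_dir) / f"{stem}_{index}{ext}"
        path.write_bytes(data)
        paths.append(path)
    return paths
```

With the `response` from the snippet above, `texts, images = split_parts(response)` followed by `save_images(images, "artisan")` would store each returned image next to the script.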

Advanced Image Editing and Manipulation

While nano-banana excels at generating images from scratch, its real power lies in its editing capabilities. Let's explore how to modify the image we just created.

Prompt-Based Editing

We can make a small change by adding reading glasses to the artisan with a simple prompt:

Using the provided image, place a pair of thin reading glasses gently on the artisan's nose while she draws the wax lines. Ensure reflections look realistic and the glasses sit naturally on her face without obscuring her eyes.

The model edits the original image while keeping everything else consistent:

The same artisan now wearing a pair of thin reading glasses.

To perform this edit in Python, you provide the base image along with the new prompt:

```python
from PIL import Image

# This code assumes 'client' has been configured from the previous step
base_image = Image.open('/path/to/your/photo.png')
edit_prompt = "Using the provided image, place a pair of thin reading glasses gently on the artisan's nose..."

response = client.models.generate_content(
    model="gemini-2.5-flash-image-preview",
    contents=[edit_prompt, base_image],
)
```

Character Consistency

Let's generate a new scene while keeping the same person. This time, she will be looking at the camera and smiling.

Generate a new and photorealistic image using the provided image as a reference for identity: the same batik artisan now looking up at the camera with a relaxed smile, seated at the same wooden table. Medium close-up, 85 mm look with soft veranda light, background jars subtly blurred.

The result maintains the character's identity in a new pose:

The batik artisan looking up from her work and smiling at the camera.

Let's try an even more significant change, where she presents a finished cloth:

Create a product-style image using the provided image as identity reference: the same artisan presenting a finished indigo batik cloth, arms extended toward the camera. Soft, even window light, 50 mm look, neutral background clutter.

Even with a completely different scene, the character remains consistent:

The artisan presenting a finished indigo batik cloth towards the camera.
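The two scenes above follow the same pattern: each request pairs a new scene prompt with the same identity reference image. A minimal sketch of that loop is shown below. The helper `generate_scenes` is a hypothetical name, it assumes a `client` configured as in the first Python snippet, and it only relies on the `client.models.generate_content(model=..., contents=...)` call already used in this guide.

```python
def generate_scenes(client, reference_image, scene_prompts,
                    model="gemini-2.5-flash-image-preview"):
    """Send one request per scene prompt, always attaching the same reference image.

    Returns a dict mapping each prompt to the raw response for that scene.
    """
    results = {}
    for prompt in scene_prompts:
        results[prompt] = client.models.generate_content(
            model=model,
            # The reference image travels with every prompt so identity stays anchored.
            contents=[prompt, reference_image],
        )
    return results
```

Reusing the original portrait as the reference for every scene, rather than chaining each new output into the next request, is a design choice in this sketch: it keeps small identity drifts from accumulating across scenes.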

Style Transfer

Nano-banana can also transfer the style of an image. Let's change our photorealistic image into a watercolor painting.

Using the provided image as identity reference, recreate the scene as a delicate watercolor on cold-press paper: loose indigo washes for the cloth, soft bleeding edges on the floral motif, pale umbers for the table and background. Keep her pose holding the fabric, gentle smile, and round glasses; let the veranda recede into light granulation and visible paper texture.

The model successfully applies the new style while preserving the subject and composition:

A watercolor painting of the artisan holding the batik cloth.

Image Fusion

Finally, let's try fusing an object from one image into another. First, we'll generate an image of a hat:

An image of a straw hat with a decorative ribbon.

Now, we'll use a prompt to place this hat on our artisan's head in the watercolor image:

Move the same woman and pose outdoors in open shade and place the straw hat from the product image on her head. Align the crown and brim to the head realistically; bow over her right ear (camera left), ribbon tails drifting softly with gravity. Use soft sky light as key with a gentle rim from the bright background. Maintain true straw and lace texture, natural skin tone, and a believable shadow from the brim over the forehead and top of the glasses. Keep the batik cloth and her hands unchanged. Keep the watercolor style unchanged.

This process merges the two images. You can do this in Python by providing both images and the fusion prompt:

```python
from PIL import Image

# This code assumes 'client' has been configured from the first step
base_image = Image.open('/path/to/your/photo.png')
hat_image = Image.open('/path/to/your/hat.png')
fusion_prompt = "Move the same woman and pose outdoors in open shade and place the straw hat..."

response = client.models.generate_content(
    model="gemini-2.5-flash-image-preview",
    contents=[fusion_prompt, base_image, hat_image],
)
```

For best results, use at most three input images per request; adding more can reduce output quality.
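The three-image recommendation can be enforced before a request is sent. The sketch below is one way to do that, with hypothetical names (`build_contents`, `MAX_INPUT_IMAGES`); the limit is the recommendation from this guide, not a value read from the API.

```python
MAX_INPUT_IMAGES = 3  # recommendation from this guide, not an API-enforced constant


def build_contents(prompt, images, max_images=MAX_INPUT_IMAGES):
    """Return [prompt, *images] for generate_content, rejecting oversized inputs."""
    images = list(images)
    if not prompt or not prompt.strip():
        raise ValueError("prompt must not be empty")
    if len(images) > max_images:
        raise ValueError(
            f"got {len(images)} input images; use at most {max_images}"
        )
    return [prompt, *images]
```

For the fusion example, `contents=build_contents(fusion_prompt, [base_image, hat_image])` would produce the same list as the hand-written one while guarding against accidentally attaching too many images.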

Final Thoughts

Google's Gemini 2.5 Flash Image, or nano-banana, is a powerful new tool in the world of AI image generation. Its greatest strength lies in editing existing images, allowing for remarkable transformations while maintaining consistency across a series of visuals.

Experiment with the model yourself. Iteration is key, as the perfect image often comes after a few attempts and prompt refinements.
