Google: Nano Banana (Gemini 2.5 Flash Image)

Google Nano Banana (Gemini 2.5 Flash Image) is a multimodal model from Google that accepts image and text input and returns image and text output. It has a 32,768-token context window and supports up to 8,192 completion tokens. The model does not support tool calling or reasoning modes, so workflows that depend on either will need a different option; structured output is supported. At $0.30 per million input tokens and $2.50 per million output tokens, it sits at a low-to-mid price point for image-capable models, which makes it worth considering for teams running image-plus-text tasks at volume on a budget. One caveat is significant: no independent benchmark coverage is available, so its quality relative to competing models is unproven. Buyers who need validated performance data before committing should wait for third-party evaluations or run their own representative tests first.
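To make the per-million pricing concrete, here is a minimal Python sketch that estimates the cost of a workload from its token counts. The function name `estimate_cost_usd` is hypothetical, and the sketch assumes every token is billed at the flat listed rates; any separate billing for image inputs or outputs is not covered by the figures on this page.

```python
from decimal import Decimal

# Listed prices from this page, in USD per one million tokens.
INPUT_USD_PER_MILLION = Decimal("0.30")
OUTPUT_USD_PER_MILLION = Decimal("2.50")
MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate cost in USD, assuming flat per-token billing at the listed rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) * INPUT_USD_PER_MILLION
    output_cost = Decimal(output_tokens) * OUTPUT_USD_PER_MILLION
    return (input_cost + output_cost) / MILLION


# Example: 10,000 requests averaging 2,000 input and 500 output tokens each.
batch_cost = estimate_cost_usd(10_000 * 2_000, 10_000 * 500)
print(f"Estimated batch cost: ${batch_cost}")
```

Decimal arithmetic is used so that small per-request amounts do not pick up floating-point rounding noise when summed over large batches.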

Quality Score: 67/100 (price + capability + benchmarks)
Input Price: $0.30 per 1M tokens
Output Price: $2.50 per 1M tokens
Context Window: 32,768 tokens
Model ID: google/gemini-2.5-flash-image
Vendor: google
Tokenizer: Gemini
Input Modalities: image, text
Output Modalities: image, text
Max Output: 8,192 tokens
Tool Calling: not supported
Structured Output: ✓ supported
Reasoning Mode: not supported
Vision: ✓ accepts images
Audio: no
Moderated: no
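The context window and max output limits interact when sizing a request. The sketch below shows one way to compute how many completion tokens can safely be requested for a given prompt size. It assumes, as is common but not stated on this page, that prompt and completion tokens share the 32,768-token window; the function name `max_completion_budget` is hypothetical.

```python
# Limits taken from the specification list on this page.
CONTEXT_WINDOW_TOKENS = 32_768
MAX_OUTPUT_TOKENS = 8_192


def max_completion_budget(
    prompt_tokens: int, requested_output_tokens: int = MAX_OUTPUT_TOKENS
) -> int:
    """Return the completion-token budget that fits, or 0 if the prompt does not fit.

    Assumption: prompt and completion tokens share one context window.
    """
    if prompt_tokens < 0 or requested_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    remaining = CONTEXT_WINDOW_TOKENS - prompt_tokens
    if remaining <= 0:
        return 0
    # Cap by the caller's request, the model's output limit, and the space left.
    return min(requested_output_tokens, MAX_OUTPUT_TOKENS, remaining)


print(max_completion_budget(1_000))   # short prompt: full output limit applies
print(max_completion_budget(30_000))  # long prompt: remaining window is the cap
```

A return value of 0 signals that the prompt should be shortened or split before sending.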

Category rankings

Where Google: Nano Banana (Gemini 2.5 Flash Image) places in the one category it ranks in.

Rank: #8 of 8 ranked
Category: Image Generation (Vision)
Score: 82

Similar models