Google: Nano Banana 2 (Gemini 3.1 Flash Image Preview)

Nano Banana 2 is a multimodal model from Google that accepts text and image inputs, produces text and image outputs, and returns up to 65,536 tokens per response within a 131,072-token context window. It supports reasoning and, according to the specification listing on this page, structured output, but it does not support tool calling. That combination suits tasks where vision understanding, image generation, and extended context matter more than function calling. At $0.50 per million input tokens and $3.00 per million output tokens, input pricing sits in the budget-to-mid range while output costs six times as much, so teams with output-heavy workflows should budget accordingly. The harder limitation for comparison purposes is that the model has no independent benchmark coverage yet, so its quality relative to alternatives is unproven. Teams comfortable evaluating it against their own use cases may find it worth testing; those who rely on published benchmarks to shortlist candidates will want to wait for third-party results before committing.
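To make the input/output price gap concrete, here is a minimal sketch of a per-request cost estimate. It assumes flat per-token billing at the two listed rates and ignores any separate image-specific pricing; the function name and the example token counts are hypothetical and chosen only for illustration.

```python
# Listed rates from this page, in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.50
OUTPUT_PRICE_PER_M = 3.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request, assuming flat per-token rates."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Hypothetical workloads: one input-heavy, one output-heavy.
input_heavy = estimate_cost(input_tokens=100_000, output_tokens=2_000)
output_heavy = estimate_cost(input_tokens=2_000, output_tokens=60_000)

print(f"input-heavy request:  ${input_heavy:.3f}")   # about $0.056
print(f"output-heavy request: ${output_heavy:.3f}")  # about $0.181
```

Under these assumptions, the output-heavy request costs roughly three times as much as the input-heavy one despite moving fewer total tokens, which is why the output rate deserves attention when budgeting.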

Quality Score: 84/100 (price + capability + benchmarks)
Input Price: $0.50 per 1M tokens
Output Price: $3.00 per 1M tokens
Context Window: 131,072 tokens
Model ID: google/gemini-3.1-flash-image-preview
Vendor: google
Tokenizer: Gemini
Input Modalities: image, text
Output Modalities: image, text
Max Output: 65,536 tokens
Tool Calling: not supported
Structured Output: ✓ supported
Reasoning Mode: ✓ supported
Vision: ✓ accepts images
Audio: no
Moderated: no

Category rankings

Where Google: Nano Banana 2 (Gemini 3.1 Flash Image Preview) places in the one category it ranks in.

Rank: #6 of 8 ranked
Category: Image Generation · Vision
Score: 99

Similar models