Google: Gemini 2.5 Pro

Gemini 2.5 Pro is a paid model from Google that accepts text, images, files, audio, and video as input and produces text output, making it one of the broader multimodal options available. Its context window is 1,048,576 tokens (roughly one million), enough to hold very long documents or extended conversations in a single request, and it can return up to 65,536 tokens per response. The model supports tool calling and a reasoning mode, which makes it useful in agentic workflows that require multi-step problem-solving or external API calls. Structured output is also listed as supported. At $1.25 per million input tokens and $10.00 per million output tokens, it sits in a mid-to-upper price tier, and output tokens cost eight times as much as input tokens, so cost-sensitive users should pay particular attention to response length. Its blended benchmark score of 94.2 is strong, but that figure currently draws on only one independent benchmark, so treat it as a promising but limited signal rather than a broad verdict. Teams handling long multimodal workloads or complex reasoning tasks are the most natural fit.
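To make the pricing concrete, the following is a minimal sketch of a per-request cost estimate using the two listed rates. The function name `estimate_cost` and the sample token counts are illustrative assumptions, and the sketch assumes flat per-token rates; any tiered pricing, caching discounts, or reasoning-token charges a provider may apply are not modeled.

```python
# Listed prices in USD per one million tokens (taken from this listing).
INPUT_PRICE_PER_M = 1.25
OUTPUT_PRICE_PER_M = 10.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return an estimated request cost in USD, assuming flat rates.

    Hypothetical helper: real billing may differ (tiers, caching, etc.).
    """
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Example: a 200,000-token document summarized into a 4,000-token answer.
cost = estimate_cost(200_000, 4_000)
print(f"Estimated cost: ${cost:.4f}")
```

Under these assumptions the 200,000 input tokens account for $0.25 and the 4,000 output tokens for $0.04, which shows how quickly long responses can come to matter despite the far larger input.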

Quality Score: 100/100 (price + capability + benchmarks)
Input Price: $1.25 per 1M tokens
Output Price: $10.00 per 1M tokens
Context Window: 1,048,576 tokens
Model ID: google/gemini-2.5-pro
Vendor: google
Tokenizer: Gemini
Input Modalities: text, image, file, audio, video
Output Modalities: text
Max Output: 65,536 tokens
Tool Calling: ✓ supported
Structured Output: ✓ supported
Reasoning Mode: ✓ supported
Vision: ✓ accepts images
Audio: ✓ accepts audio
Moderated: no
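
The context window and max output figures above can be turned into a simple pre-flight check before sending a request. This is a rough sketch only: the helper name `fits_limits` is hypothetical, token counts are assumed to be measured beforehand with a suitable tokenizer, and it assumes that prompt and output tokens share the same context window, which this listing does not confirm.

```python
# Limits as stated in this listing.
CONTEXT_WINDOW = 1_048_576  # total tokens
MAX_OUTPUT = 65_536         # maximum tokens per response


def fits_limits(prompt_tokens: int, requested_output_tokens: int) -> bool:
    """Check a request against the listed limits.

    Assumption: prompt and output tokens count toward one shared window.
    """
    if requested_output_tokens > MAX_OUTPUT:
        return False
    return prompt_tokens + requested_output_tokens <= CONTEXT_WINDOW


print(fits_limits(1_000_000, 40_000))  # within both limits
print(fits_limits(1_000_000, 65_536))  # exceeds the shared window
```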

Category rankings

Where Google: Gemini 2.5 Pro places across the 4 categories it ranks in.

Rank   Category              Type    Models ranked   Score
#5     Audio Summarization   Voice   19              147
#11    Video Summarization   Video   25              147
#14    Transcription         Voice   19              115
#14    TTS Replacement       Voice   19              115