Google Gemini Pro Latest

Google Gemini Pro Latest is a multimodal model from Google that accepts text, images, audio, video, and files as input and returns text. It supports tool use and reasoning, offers a context window of 1,048,576 tokens, and can produce up to 65,536 tokens per response. Structured output is listed as supported in the specifications below. At $2.00 per million input tokens and $12.00 per million output tokens, the model sits in a mid-to-upper pricing tier. The primary consideration for comparison shoppers is that it currently has no independent benchmark coverage, so its quality relative to competitors has not been verified by third-party testing. It is worth shortlisting for teams that specifically need long-context multimodal processing across diverse file types, but buyers who require validated performance data before committing budget should treat it as unproven until benchmark results become available.
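To make the pricing concrete, here is a minimal sketch of a per-request cost estimate using only the figures quoted in this listing. The function name `estimate_cost` and the constant names are hypothetical. The sketch assumes a flat rate per token, with no tiering, caching discounts, or separate billing for reasoning tokens, none of which this listing specifies. It also assumes the context window limit applies to input tokens alone, which the listing does not confirm.

```python
# Figures copied from the listing; the names are illustrative, not a vendor API.
INPUT_PRICE_PER_M = 2.00     # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 12.00   # USD per 1M output tokens
CONTEXT_WINDOW = 1_048_576   # tokens
MAX_OUTPUT = 65_536          # tokens


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at the listed flat rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Assumption: the context window is checked against input tokens only.
    if input_tokens > CONTEXT_WINDOW:
        raise ValueError("input exceeds the listed context window")
    if output_tokens > MAX_OUTPUT:
        raise ValueError("output exceeds the listed maximum output")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # A 200,000-token document summarized into 4,000 tokens.
    print(f"${estimate_cost(200_000, 4_000):.3f}")
```

Under these assumptions, output tokens cost six times as much as input tokens, so long generated responses raise the bill much faster than long prompts do.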

Quality Score: 100/100 (price + capability + benchmarks)
Input Price: $2.00 per 1M tokens
Output Price: $12.00 per 1M tokens
Context Window: 1,048,576 tokens
Model ID: ~google/gemini-pro-latest
Vendor: ~google
Tokenizer: Router
Input Modalities: audio, file, image, text, video
Output Modalities: text
Max Output: 65,536 tokens
Tool Calling: ✓ supported
Structured Output: ✓ supported
Reasoning Mode: ✓ supported
Vision: ✓ accepts images
Audio: ✓ accepts audio
Moderated: no

Category rankings

Where Google Gemini Pro Latest places across the 4 categories it ranks in.

#     Category                                     Score
#4    TTS Replacement (Voice · of 19 ranked)       115
#10   Transcription (Voice · of 19 ranked)         115
#10   Audio Summarization (Voice · of 19 ranked)   139
#22   Video Summarization (Video · of 25 ranked)   139