Google: Gemini 2.5 Flash Lite
Gemini 2.5 Flash Lite is Google's multimodal model. It accepts text, images, files, audio, and video as input and produces text output, with a context window of 1,048,576 tokens and a maximum of 65,535 output tokens per response. It supports tool calling, structured output, and a reasoning mode. The broad input modalities and very large context window make it suitable for tasks involving mixed media or long documents. It is priced at $0.10 per million input tokens and $0.40 per million output tokens, which places it toward the budget end of the market. Benchmark coverage is thin: the blended score of 9.9 rests on a single independent benchmark, so performance claims should be treated as preliminary. Teams with cost-sensitive, high-volume workloads that need multimodal input and a large context window should shortlist it. Teams that require validated quality benchmarks may want to wait for broader evaluation data.
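To make the pricing concrete, here is a minimal sketch of a per-request cost estimate using only the figures quoted above. The function name `estimate_cost_usd` is hypothetical, and the sketch assumes the listed rates apply uniformly to every token regardless of modality or request size, which the listing does not confirm.

```python
# Figures quoted in the listing above.
INPUT_PRICE_PER_M = 0.10      # USD per one million input tokens
OUTPUT_PRICE_PER_M = 0.40     # USD per one million output tokens
CONTEXT_WINDOW = 1_048_576    # tokens
MAX_OUTPUT_TOKENS = 65_535    # tokens per response


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumption: flat per-token pricing with no tiers or per-modality rates.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if input_tokens > CONTEXT_WINDOW:
        raise ValueError("input exceeds the 1,048,576-token context window")
    if output_tokens > MAX_OUTPUT_TOKENS:
        raise ValueError("output exceeds the 65,535-token response limit")
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


# A long-document request: one million tokens in, fifty thousand out.
print(f"${estimate_cost_usd(1_000_000, 50_000):.4f}")  # prints $0.1200
```

At these rates a request costs well under a dollar even when it nearly fills the context window, which is why the model fits high-volume workloads.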
- Model ID: google/gemini-2.5-flash-lite
- Vendor: Google
- Tokenizer: Gemini
- Input Modalities: text, image, file, audio, video
- Output Modalities: text
- Max Output: 65,535 tokens
- Tool Calling: ✓ supported
- Structured Output: ✓ supported
- Reasoning Mode: ✓ supported
- Vision: ✓ accepts images
- Audio: ✓ accepts audio
- Moderated: no
Category rankings
Where Google: Gemini 2.5 Flash Lite places across the 10 categories it ranks in.
| Rank | Category | Group | Models ranked | Score |
|---|---|---|---|---|
| #7 | Transcription | Voice | 19 | 123 |
| #8 | Audio Summarization | Voice | 19 | 140 |
| #12 | TTS Replacement | Voice | 19 | 115 |
| #13 | Social Media Posts | Writing | 25 | 119 |
| #13 | Voice Assistant Backend | Voice | 25 | 123 |
| #13 | Cheap Bulk Inference | Cost | 25 | 137 |
| #13 | Real-Time Chat | Latency | 25 | 118 |
| #14 | Self-Hosted / Local | Cost | 25 | 117 |
| #19 | Video Summarization | Video | 25 | 140 |
| #22 | Code Completion | Code | 25 | 132 |
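Because the categories rank different numbers of models (19 in some, 25 in others), raw positions are not directly comparable. One rough way to compare them is to divide each rank by the size of its pool. The sketch below does this for three rows of the table; the helper `relative_rank` is hypothetical and is not part of the site's ranking method.

```python
def relative_rank(rank: int, pool_size: int) -> float:
    """Return rank as a fraction of the pool (lower is better).

    Hypothetical helper for comparison only; not the site's ranking method.
    """
    if not 1 <= rank <= pool_size:
        raise ValueError("rank must be between 1 and pool_size")
    return rank / pool_size


# (category, rank, models ranked) copied from the table above.
rows = [
    ("Transcription", 7, 19),
    ("Real-Time Chat", 13, 25),
    ("Code Completion", 22, 25),
]

for category, rank, pool in rows:
    print(f"{category}: top {relative_rank(rank, pool):.0%} of {pool}")
```

On this measure the model places in roughly the top 37% for transcription, the top 52% for real-time chat, and the top 88% for code completion.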