Google: Gemini 3.1 Flash Lite Preview
Gemini 3.1 Flash Lite Preview is Google's multimodal offering that accepts text, images, video, audio, and file inputs. Its context window reaches 1,048,576 tokens, making it suited for long-document or multi-turn workloads. The model supports tool use and reasoning, though structured output support is unconfirmed from available data. At $0.25 per million input tokens and $1.50 per million output tokens, it sits in budget territory and may appeal to developers building high-volume pipelines who want multimodal flexibility without paying premium rates. The significant caveat is that no independent benchmark coverage currently exists, so its actual task performance relative to competitors is unproven. Buyers who need verified quality scores before committing should treat this as a model to monitor rather than one with an established track record.
- Model ID
- google/gemini-3.1-flash-lite-preview
- Vendor
- Tokenizer
- Gemini
- Input Modalities
- text, image, video, file, audio
- Output Modalities
- text
- Max Output
- 65,536 tokens
- Tool Calling
- ✓ supported
- Structured Output
- ✓ supported
- Reasoning Mode
- ✓ supported
- Vision
- ✓ accepts images
- Audio
- ✓ accepts audio
- Moderated
- no
Strong choice for
Category rankings
Where Google: Gemini 3.1 Flash Lite Preview places across the 4 categories it ranks in. How we rank →
| # | Category | Score |
|---|---|---|
| #4 | TranscriptionVoice · of 19 ranked | 123 |
| #7 | TTS ReplacementVoice · of 19 ranked | 115 |
| #13 | Audio SummarizationVoice · of 19 ranked | 139 |
| #17 | Video Auto-TaggingVideo · of 25 ranked | 123 |