Best AI model for Audio Summarization (2026)
Summarize podcasts and meetings directly from audio. Ranked from 346 live models in the OpenRouter catalog, weighted for audio input, context window, and reasoning quality.
| # | Model | Model ID | Score | Input ($ / 1M tokens) | Output ($ / 1M tokens) | Context (tokens) |
|---|---|---|---|---|---|---|
| 1 | Google: Gemini 3.1 Flash Lite Preview | `google/gemini-3.1-flash-lite-preview` | 139 | $0.25 | $1.50 | 1,048,576 |
| 2 | Google: Gemini 3.1 Pro Preview Custom Tools | `google/gemini-3.1-pro-preview-customtools` | 139 | $2.00 | $12.00 | 1,048,576 |
| 3 | Google: Gemini 3.1 Pro Preview | `google/gemini-3.1-pro-preview` | 139 | $2.00 | $12.00 | 1,048,576 |
| 4 | Google: Gemini 3 Flash Preview | `google/gemini-3-flash-preview` | 139 | $0.50 | $3.00 | 1,048,576 |
| 5 | Google: Gemini 2.5 Flash Lite Preview 09-2025 | `google/gemini-2.5-flash-lite-preview-09-2025` | 139 | $0.10 | $0.40 | 1,048,576 |
| 6 | Google: Gemini 2.5 Flash Lite | `google/gemini-2.5-flash-lite` | 139 | $0.10 | $0.40 | 1,048,576 |
| 7 | Google: Gemini 2.5 Flash | `google/gemini-2.5-flash` | 139 | $0.30 | $2.50 | 1,048,576 |
| 8 | Google: Gemini 2.5 Pro | `google/gemini-2.5-pro` | 139 | $1.25 | $10.00 | 1,048,576 |
| 9 | Google: Gemini 2.5 Pro Preview 06-05 | `google/gemini-2.5-pro-preview` | 139 | $1.25 | $10.00 | 1,048,576 |
| 10 | Google: Gemini 2.5 Pro Preview 05-06 | `google/gemini-2.5-pro-preview-05-06` | 139 | $1.25 | $10.00 | 1,048,576 |
| 11 | Auto Router | `openrouter/auto` | 139 | Varies by routed model | Varies by routed model | 2,000,000 |
| 12 | Xiaomi: MiMo-V2-Omni | `xiaomi/mimo-v2-omni` | 133 | $0.40 | $2.00 | 262,144 |
| 13 | Google: Gemini 2.0 Flash Lite | `google/gemini-2.0-flash-lite-001` | 131 | $0.07 | $0.30 | 1,048,576 |
| 14 | Google: Gemini 2.0 Flash | `google/gemini-2.0-flash-001` | 131 | $0.10 | $0.40 | 1,048,576 |
| 15 | OpenAI: GPT Audio Mini | `openai/gpt-audio-mini` | 109 | $0.60 | $2.40 | 128,000 |
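Because the prices in the table are quoted per one million tokens, a rough per-request cost can be worked out with simple arithmetic. The sketch below is illustrative only: the function name `estimate_cost_usd` and the token counts are assumptions, and how many tokens a given audio file consumes depends on the model and is not stated on this page.

```python
def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      in_per_million: float, out_per_million: float) -> float:
    """Estimate one request's cost from per-million-token prices (USD)."""
    input_cost = (input_tokens / 1_000_000) * in_per_million
    output_cost = (output_tokens / 1_000_000) * out_per_million
    return input_cost + output_cost


# Hypothetical workload: 200,000 input tokens and a 2,000-token summary.
# Prices are taken from the table rows for Gemini 2.5 Flash and Gemini 2.5 Pro.
flash = estimate_cost_usd(200_000, 2_000, in_per_million=0.30, out_per_million=2.50)
pro = estimate_cost_usd(200_000, 2_000, in_per_million=1.25, out_per_million=10.00)

print(f"Gemini 2.5 Flash: ${flash:.4f}")  # Gemini 2.5 Flash: $0.0650
print(f"Gemini 2.5 Pro:   ${pro:.4f}")    # Gemini 2.5 Pro:   $0.2700
```

Since several models share the same score, a comparison like this may be one way to break ties for a given workload.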
How we ranked these
For Audio Summarization, we weight models on audio input, context window, and reasoning quality. A higher score is better. Scores combine OpenRouter's model metadata (context length, modality support, tool calling, structured output, reasoning capability) with public pricing. See the full methodology for details.
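To make the idea of a weighted ranking concrete, here is a minimal sketch of a weighted-sum score over the three criteria named above. It is not the formula used for this page: the weights, the context cap, the `ModelInfo` fields, and the example model IDs are all hypothetical, and the pricing component mentioned above is left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class ModelInfo:
    model_id: str
    accepts_audio: bool       # model lists audio as an input modality
    context_length: int       # context window in tokens
    supports_reasoning: bool  # model exposes a reasoning capability


# Hypothetical weights; the page does not publish the real ones.
WEIGHTS = {"audio": 50.0, "context": 30.0, "reasoning": 20.0}
MAX_CONTEXT = 1_048_576  # assumed cap so very large windows do not dominate


def score(model: ModelInfo) -> float:
    """Weighted sum of audio support, normalized context size, and reasoning."""
    context_part = min(model.context_length, MAX_CONTEXT) / MAX_CONTEXT
    return (
        WEIGHTS["audio"] * model.accepts_audio
        + WEIGHTS["context"] * context_part
        + WEIGHTS["reasoning"] * model.supports_reasoning
    )


def rank(models: list[ModelInfo]) -> list[ModelInfo]:
    """Return models sorted from highest to lowest score."""
    return sorted(models, key=score, reverse=True)


# Placeholder entries, not real catalog data.
catalog = [
    ModelInfo("example/text-only-large", False, 1_048_576, True),
    ModelInfo("example/audio-small", True, 131_072, False),
    ModelInfo("example/audio-large", True, 1_048_576, True),
]

for m in rank(catalog):
    print(m.model_id, score(m))
```

With these assumed weights, a model that cannot accept audio ranks below even a small-context audio model, which matches the stated priority on audio input.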