google

Google: Lyria 3 Clip Preview

Google: Lyria 3 Clip Preview is a text-and-image input model with a 1-million-token context window and a 65,536-token output ceiling. It does not support tool use, reasoning modes, or structured output, so its utility is limited to straightforward generative tasks that fit within those input modalities. The model is currently free, which makes it worth shortlisting for developers and researchers who want to experiment with large-context multimodal prompting at no cost. The significant caveat is that there is no independent benchmark coverage yet, so actual quality relative to paid alternatives is unverified. Users who need proven, measurable performance for production work should treat this as an exploratory option until third-party evaluations become available.

Quality Score
88/100
price + capability + benchmarks
Input Price
Free
per 1M tokens
Output Price
Free
per 1M tokens
Context Window
1,048,576
tokens
Model ID
google/lyria-3-clip-preview
Vendor
google
Tokenizer
Other
Input Modalities
text, image
Output Modalities
text, audio
Max Output
65,536 tokens
Tool Calling
not supported
Structured Output
✓ supported
Reasoning Mode
not supported
Vision
✓ accepts images
Audio
no
Moderated
no

Similar models