Qwen: Qwen3.5-35B-A3B
Qwen3.5-35B-A3B is a multimodal model from Qwen that accepts text, image, and video inputs and produces text output, so it suits workflows that involve mixed media rather than text alone. It supports tool calling, structured output, and a reasoning mode, which makes it usable for agentic tasks and multi-step problem solving. The context window is 262,144 tokens, so documents or conversations up to that length fit in a single request without truncation. At $0.14 per million input tokens and $1.00 per million output tokens, it sits at the affordable end of the pricing spectrum and is worth considering for high-volume or cost-sensitive deployments. Its blended benchmark score of 47.9 is drawn from only one independent benchmark, so treat that figure as a limited signal rather than a well-rounded performance profile. Teams that need multimodal coverage at a low input cost may find it practical, but buyers who prioritize broad benchmark evidence should weigh that thin coverage carefully before committing.
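To make the pricing concrete, here is a minimal sketch of the cost arithmetic using only the two per-million-token prices quoted above. The function name `estimate_cost_usd` and the example workload (10,000 requests of 2,000 input and 500 output tokens each) are illustrative assumptions, not figures from this listing, and real invoices may differ if a provider bills reasoning tokens or media inputs separately.

```python
# Prices quoted in the listing (USD per one million tokens).
INPUT_PRICE_PER_MTOK = 0.14
OUTPUT_PRICE_PER_MTOK = 1.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a request at the listed token prices."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        input_tokens * INPUT_PRICE_PER_MTOK
        + output_tokens * OUTPUT_PRICE_PER_MTOK
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload, chosen only for illustration.
    requests = 10_000
    per_request = estimate_cost_usd(input_tokens=2_000, output_tokens=500)
    print(f"per request: ${per_request:.5f}")
    print(f"{requests} requests: ${per_request * requests:.2f}")
```

Because output tokens cost roughly seven times as much as input tokens at these prices, workloads with long generated responses will be dominated by the output term.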
- Model ID: qwen/qwen3.5-35b-a3b
- Vendor: qwen
- Tokenizer: Qwen3
- Input Modalities: text, image, video
- Output Modalities: text
- Max Output: 262,144 tokens
- Tool Calling: ✓ supported
- Structured Output: ✓ supported
- Reasoning Mode: ✓ supported
- Vision: ✓ accepts images
- Audio: no
- Moderated: no
Category rankings
Where Qwen: Qwen3.5-35B-A3B places across the 8 categories it ranks in.
| Rank | Category | Area | Score |
|---|---|---|---|
| #18 of 25 | Image Captioning | Vision | 120 |
| #19 of 25 | Social Media Posts | Writing | 119 |
| #19 of 25 | Voice Assistant Backend | Voice | 123 |
| #19 of 25 | Video Auto-Tagging | Video | 123 |
| #20 of 25 | Video Summarization | Video | 140 |
| #20 of 25 | Self-Hosted / Local | Cost | 117 |
| #20 of 25 | Real-Time Chat | Latency | 117 |
| #25 of 25 | Cheap Bulk Inference | Cost | 137 |