Qwen: Qwen3.6 Flash

Qwen3.6 Flash is a multimodal model from Qwen that accepts text, image, and video inputs and supports tool use and reasoning. Its context window is 1,000,000 tokens, and output is capped at 65,536 tokens per response. Structured output is listed as supported in the specifications, but that support has not been independently confirmed. The combination of long context, visual input, and reasoning suits document-heavy or media-rich workflows where tool integration matters. Pricing is low relative to many multimodal reasoning models, at $0.1875 per million input tokens and $1.125 per million output tokens, which keeps high-volume use affordable. The main caveat is that Qwen3.6 Flash has no independent benchmark coverage at this time, so its quality relative to competing models is unverified. Cost-sensitive buyers who are willing to run their own evaluations have a reasonable case to shortlist it; those who need validated performance data before committing should wait for third-party results.
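The pricing above can be turned into a per-request estimate with simple arithmetic. The following is a minimal sketch, assuming the two listed rates are the only charges and that token counts are already known; the function name `estimate_cost` and the sample token counts are hypothetical and not part of any Qwen API.

```python
from decimal import Decimal

# Listed prices in USD per 1,000,000 tokens (taken from the description above).
INPUT_PRICE_PER_M = Decimal("0.1875")
OUTPUT_PRICE_PER_M = Decimal("1.125")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost of one request.

    Hypothetical helper: assumes flat per-token pricing with no
    caching discounts, tiers, or per-image/per-video surcharges.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_PRICE_PER_M
        + Decimal(output_tokens) * OUTPUT_PRICE_PER_M
    )
    return total / TOKENS_PER_M


if __name__ == "__main__":
    # A 200,000-token document with a 4,000-token answer.
    print(estimate_cost(200_000, 4_000))      # 0.042 USD
    # A full context window with the maximum listed output.
    print(estimate_cost(1_000_000, 65_536))   # 0.261228 USD
```

Decimal arithmetic is used here so that small per-request amounts do not pick up floating-point rounding noise when summed over many requests.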

Quality Score: 100/100 (price + capability + benchmarks)
Input Price: $0.1875 per 1M tokens
Output Price: $1.125 per 1M tokens
Context Window: 1,000,000 tokens

Model ID: qwen/qwen3.6-flash
Vendor: qwen
Tokenizer: Qwen3
Input Modalities: text, image, video
Output Modalities: text
Max Output: 65,536 tokens
Tool Calling: ✓ supported
Structured Output: ✓ supported
Reasoning Mode: ✓ supported
Vision: ✓ accepts images
Audio: no
Moderated: no
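The limits in the specification list can be checked before a request is sent. The sketch below is illustrative only: the helper `check_request` is hypothetical, token counts are assumed to come from the caller's own tokenizer, and it assumes that requested output tokens count against the same 1,000,000-token context window as the prompt, which the listing does not state.

```python
# Limits copied from the specification list above.
CONTEXT_WINDOW = 1_000_000
MAX_OUTPUT = 65_536
INPUT_MODALITIES = frozenset({"text", "image", "video"})


def check_request(prompt_tokens: int, max_output_tokens: int, modalities) -> list:
    """Return a list of problems; an empty list means the request fits.

    Hypothetical pre-flight check. ASSUMPTION: prompt and output tokens
    share one context window. Adjust if the provider documents otherwise.
    """
    problems = []

    # Audio (or anything else outside the listed inputs) is rejected.
    unsupported = sorted(set(modalities) - INPUT_MODALITIES)
    if unsupported:
        problems.append("unsupported input modalities: " + ", ".join(unsupported))

    if max_output_tokens > MAX_OUTPUT:
        problems.append(
            f"max_output_tokens {max_output_tokens} exceeds the {MAX_OUTPUT} cap"
        )

    if prompt_tokens + max_output_tokens > CONTEXT_WINDOW:
        problems.append(
            f"prompt plus output ({prompt_tokens + max_output_tokens}) "
            f"exceeds the {CONTEXT_WINDOW}-token context window"
        )

    return problems


if __name__ == "__main__":
    print(check_request(900_000, 65_536, {"text", "video"}))  # fits: []
    print(check_request(950_000, 65_536, {"text", "audio"}))  # two problems
```

Returning a list of problems rather than raising on the first one lets a caller report every violation at once, for example when validating a batch of queued jobs.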

Category rankings

Where Qwen3.6 Flash places across the 3 categories it ranks in.

Rank	Category	Group	Score
#6	Video Auto-Tagging	Video (of 25 ranked)	123
#16	Real-Time Chat	Latency (of 25 ranked)	117
#25	Video Summarization	Video (of 25 ranked)	139

Similar models

Qwen: Qwen3.7 Plus

Qwen: Qwen3.5 Plus 2026-04-20

Qwen: Qwen3.6 35B A3B

Qwen: Qwen3.6 27B

Qwen: Qwen3.6 Plus

Qwen: Qwen3.5-9B