xAI: Grok 4.20 vs Qwen: Qwen3.5-9B

Side-by-side comparison of specs, pricing, benchmark scores, and task rankings. Updated 2026-06-22.

                     xAI: Grok 4.20    Qwen: Qwen3.5-9B
Vendor               x-ai              qwen
Quality Score        100               100
Benchmark Score      61.5              40.0
Input Price          $1.25/M           $0.10/M
Output Price         $2.50/M           $0.15/M
Context Window       2,000,000         256,000
Max Output           -                 32,768
Tool Calling
Structured Output
Reasoning Mode
Vision
Audio                -                 -

Benchmark Scores
ai_index             61.0              41.2
ai_index_coding      -                 47.3
eqbench              55.8              -
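
The listed input and output prices can be turned into a per-request cost with simple arithmetic. The sketch below assumes that "/M" means "per million tokens" and uses a hypothetical workload of 10,000 input tokens and 1,000 output tokens. The function name `request_cost` and the workload figures are illustrative and do not come from the comparison page.

```python
# Prices in USD per million tokens, copied from the spec table above.
# Assumption: "/M" on the page means "per million tokens".
PRICES = {
    "xAI: Grok 4.20": {"input": 1.25, "output": 2.50},
    "Qwen: Qwen3.5-9B": {"input": 0.10, "output": 0.15},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the listed prices."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload: 10,000 input tokens and 1,000 output tokens.
for model in PRICES:
    print(f"{model}: ${request_cost(model, 10_000, 1_000):.5f}")
```

For that workload the listed prices work out to $0.015 per request for Grok 4.20 and $0.00115 for Qwen3.5-9B, roughly a 13x difference.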

Who wins by task?

Task                           xAI: Grok 4.20    Qwen: Qwen3.5-9B
SQL Generation                 144               144
Code Review                    150               139
Code Completion                122               129
Code Refactoring               153               138
Bug Fixing                     154               143
Unit Test Generation           135               132
Code Documentation             141               130
Regex Writing                  127               126
CI/CD Pipelines                131               126
Frontend Component Design      131               131
Data Analysis                  136               138
CSV / Spreadsheet Cleanup      139               137
ETL Scripting                  142               132
JSON Extraction                123               139
Bulk Data Labeling             120               132
OCR / Document Parsing         135               134
Table Extraction from PDFs     135               134
Long-Document Summarization    154               137
Short-Form Summarization       119               127
Blog Post Writing              132               125

Scores reflect capability match, benchmark data, and pricing for each task.
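
To summarize the task table as a single tally, one option is to count how many tasks each model leads. The sketch below copies the scores from the table above and treats equal scores as ties. The helper name `tally_winners` is illustrative, and the page does not say whether it computes an overall winner this way.

```python
from collections import Counter

# (task, Grok 4.20 score, Qwen3.5-9B score), copied from the task table above.
TASK_SCORES = [
    ("SQL Generation", 144, 144),
    ("Code Review", 150, 139),
    ("Code Completion", 122, 129),
    ("Code Refactoring", 153, 138),
    ("Bug Fixing", 154, 143),
    ("Unit Test Generation", 135, 132),
    ("Code Documentation", 141, 130),
    ("Regex Writing", 127, 126),
    ("CI/CD Pipelines", 131, 126),
    ("Frontend Component Design", 131, 131),
    ("Data Analysis", 136, 138),
    ("CSV / Spreadsheet Cleanup", 139, 137),
    ("ETL Scripting", 142, 132),
    ("JSON Extraction", 123, 139),
    ("Bulk Data Labeling", 120, 132),
    ("OCR / Document Parsing", 135, 134),
    ("Table Extraction from PDFs", 135, 134),
    ("Long-Document Summarization", 154, 137),
    ("Short-Form Summarization", 119, 127),
    ("Blog Post Writing", 132, 125),
]


def tally_winners(rows):
    """Count how many tasks each model leads; equal scores count as ties."""
    counts = Counter()
    for _task, grok, qwen in rows:
        if grok > qwen:
            counts["xAI: Grok 4.20"] += 1
        elif qwen > grok:
            counts["Qwen: Qwen3.5-9B"] += 1
        else:
            counts["tie"] += 1
    return counts


tally = tally_winners(TASK_SCORES)
print(dict(tally))
```

On the scores listed, Grok 4.20 leads 13 of the 20 tasks, Qwen3.5-9B leads 5, and 2 are tied.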

Related comparisons

MoonshotAI: Kimi K2.7 Code vs xAI: Grok 4.20
MoonshotAI: Kimi K2.7 Code vs Qwen: Qwen3.5-9B
Qwen: Qwen3.7 Plus vs xAI: Grok 4.20
Qwen: Qwen3.7 Plus vs Qwen: Qwen3.5-9B
MiniMax: MiniMax M3 vs xAI: Grok 4.20
MiniMax: MiniMax M3 vs Qwen: Qwen3.5-9B
StepFun: Step 3.7 Flash vs xAI: Grok 4.20
StepFun: Step 3.7 Flash vs Qwen: Qwen3.5-9B