
StepFun: Step 3.7 Flash vs xAI: Grok Build 0.1

Side-by-side comparison of specs, pricing, benchmark scores, and task rankings. Updated 2026-06-22.

                    StepFun: Step 3.7 Flash    xAI: Grok Build 0.1
Vendor              stepfun                    x-ai
Quality Score       100                        100
Benchmark Score     48.0                       -
Input Price         $0.20/M                    $1.00/M
Output Price        $1.15/M                    $2.00/M
Context Window      256,000                    256,000
Max Output          256,000                    -
Tool Calling
Structured Output
Reasoning Mode
Vision
Audio               -                          -

Benchmark Scores
ai_index            49.1                       -
ai_index_agentic    35.5                       -
ai_index_coding     61.6                       -
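
To make the price gap concrete, here is a minimal sketch that turns the listed per-million-token prices into a per-request cost. It assumes "/M" means per one million tokens; the request size (10,000 input tokens, 2,000 output tokens) and the helper name `request_cost` are hypothetical and chosen only for illustration.

```python
# Per-million-token prices (USD) taken from the spec table above.
# Tuple order: (input price, output price).
PRICES = {
    "StepFun: Step 3.7 Flash": (0.20, 1.15),
    "xAI: Grok Build 0.1": (1.00, 2.00),
}


def request_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Return the USD cost of one request, assuming prices are per 1,000,000 tokens."""
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


# Hypothetical workload: 10,000 input tokens and 2,000 output tokens per request.
INPUT_TOKENS = 10_000
OUTPUT_TOKENS = 2_000

for model, (in_price, out_price) in PRICES.items():
    cost = request_cost(INPUT_TOKENS, OUTPUT_TOKENS, in_price, out_price)
    print(f"{model}: ${cost:.4f} per request")
```

For this assumed workload, the request costs roughly $0.0043 on Step 3.7 Flash and $0.0140 on Grok Build 0.1. The ratio shifts with the input/output mix, since the input prices differ by 5x and the output prices by less than 2x.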

Who wins by task?

Task                            StepFun: Step 3.7 Flash    xAI: Grok Build 0.1
SQL Generation                  152                        130
Code Review                     145                        126
Code Completion                 129                        116
Code Refactoring                143                        127
Bug Fixing                      154                        130
Unit Test Generation            138                        121
Code Documentation              132                        125
Regex Writing                   129                        119
CI/CD Pipelines                 131                        117
Frontend Component Design       135                        122
Data Analysis                   149                        124
CSV / Spreadsheet Cleanup       140                        127
ETL Scripting                   137                        122
JSON Extraction                 142                        123
Bulk Data Labeling              133                        121
OCR / Document Parsing          137                        128
Table Extraction from PDFs      137                        128
Long-Document Summarization     141                        129
Short-Form Summarization        128                        115
Blog Post Writing               129                        118

Scores reflect capability match, benchmark data, and pricing for each task.

Related comparisons

MoonshotAI: Kimi K2.7 Code vs StepFun: Step 3.7 Flash
MoonshotAI: Kimi K2.7 Code vs xAI: Grok Build 0.1
Qwen: Qwen3.7 Plus vs StepFun: Step 3.7 Flash
Qwen: Qwen3.7 Plus vs xAI: Grok Build 0.1
MiniMax: MiniMax M3 vs StepFun: Step 3.7 Flash
MiniMax: MiniMax M3 vs xAI: Grok Build 0.1
StepFun: Step 3.7 Flash vs Google: Gemini 3.5 Flash
StepFun: Step 3.7 Flash vs Google: Gemini 3.1 Flash Lite