OpenAI: GPT-5 Image
GPT-5 Image is a paid model from OpenAI that accepts text, images, and files as input and returns text and images, with a 400,000-token context window and up to 128,000 tokens of output per response. It supports a reasoning mode, so it can work through multi-step problems before producing an answer. It does not support tool calling, so integrations that depend on function calling need a different model. Structured output is listed as supported, but pipelines that require reliable JSON should still be tested before committing. At $10.00 per million tokens for both input and output, it is priced at a premium relative to many alternatives. No independent benchmark coverage is currently available to validate its performance claims, so buyers have no comparative evidence to work from. It is worth shortlisting for teams that specifically need long-context multimodal reasoning over documents and images, but early adopters should run their own task-specific evaluations before treating it as a production default.
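To make the pricing concrete, here is a minimal sketch of a per-request cost estimate using only the figures quoted above. The function name `estimate_cost_usd` is hypothetical, and the check that input and output share the 400,000-token window is an assumption rather than something the listing states. It also does not model any separate pricing for generated images, which the listing does not mention.

```python
# Figures taken from the listing above.
PRICE_PER_MILLION_USD = 10.00   # same rate for input and output tokens
CONTEXT_WINDOW_TOKENS = 400_000
MAX_OUTPUT_TOKENS = 128_000


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request at the flat listed rate."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if output_tokens > MAX_OUTPUT_TOKENS:
        raise ValueError("output exceeds the listed 128,000-token maximum")
    # Assumption: input and output share the 400,000-token context window.
    if input_tokens + output_tokens > CONTEXT_WINDOW_TOKENS:
        raise ValueError("request exceeds the listed 400,000-token context window")
    return (input_tokens + output_tokens) * PRICE_PER_MILLION_USD / 1_000_000


if __name__ == "__main__":
    # A 200,000-token document set with an 8,000-token answer.
    print(f"${estimate_cost_usd(200_000, 8_000):.2f}")  # prints $2.08
```

At this rate a request that fills half the context window costs about two dollars, so long-context workloads run at volume are worth budgeting for explicitly.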
- Model ID: openai/gpt-5-image
- Vendor: openai
- Tokenizer: GPT
- Input Modalities: image, text, file
- Output Modalities: image, text
- Max Output: 128,000 tokens
- Tool Calling: not supported
- Structured Output: supported
- Reasoning Mode: supported
- Vision: accepts images
- Audio: no
- Moderated: yes
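When shortlisting, it can help to check an integration's requirements against these fields mechanically. The following is an illustrative sketch only: the dictionary layout, the name `GPT5_IMAGE_SPEC`, and the helper `unmet_requirements` are hypothetical and do not correspond to any vendor API. The flag values are copied from the spec list above.

```python
# Capability flags copied from the spec list above (layout is hypothetical).
GPT5_IMAGE_SPEC = {
    "model_id": "openai/gpt-5-image",
    "input_modalities": {"image", "text", "file"},
    "output_modalities": {"image", "text"},
    "features": {
        "tool_calling": False,
        "structured_output": True,
        "reasoning": True,
        "audio": False,
    },
}


def unmet_requirements(spec, inputs=(), outputs=(), features=()):
    """Return a sorted list of requirements the spec does not satisfy."""
    missing = []
    missing += [f"input:{m}" for m in inputs if m not in spec["input_modalities"]]
    missing += [f"output:{m}" for m in outputs if m not in spec["output_modalities"]]
    # Unknown feature names are treated as unsupported.
    missing += [f"feature:{f}" for f in features if not spec["features"].get(f, False)]
    return sorted(missing)


if __name__ == "__main__":
    # A document-analysis agent that also wants function calling.
    gaps = unmet_requirements(
        GPT5_IMAGE_SPEC,
        inputs=["image", "file"],
        outputs=["text"],
        features=["reasoning", "tool_calling"],
    )
    print(gaps)  # prints ['feature:tool_calling']
```

An empty result only means the listed flags match; it says nothing about output quality, which still needs task-specific evaluation.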
Category rankings
Where OpenAI: GPT-5 Image places in the one category it ranks in.
| # | Category | Score |
|---|---|---|
| #2 of 8 | Image Generation · Vision | 105 |