Description
Confirm this is a feature request for the Python library and not the underlying OpenAI API.
- This is a feature request for the Python library
Describe the feature or improvement you're requesting
Hi OpenAI team,
I’d like to report a documentation issue regarding image token accounting, which is currently ambiguous and can easily lead to incorrect cost estimation in production.
Problem
The Images & Vision documentation mixes two different image cost mechanisms across sections, without clearly stating which models each mechanism applies to:
BASE / TILE system
- Uses detail: low | high
- low → fixed base tokens
- high → base tokens + tiles × tile tokens
- Applies to models like GPT-4o, GPT-4.1, GPT-5, etc.
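For the BASE/TILE system, here is a minimal sketch of the calculation, using the published GPT-4o constants (85 base tokens, 170 per tile) and the documented resize steps (fit within 2048×2048, then shortest side down to 768, then count 512×512 tiles). Models such as GPT-4o-mini use the same formula with different constants, so they are parameters here:

```python
import math

def tile_tokens(width: int, height: int, detail: str = "high",
                base: int = 85, per_tile: int = 170) -> int:
    """Estimate image tokens under the BASE/TILE system.

    base=85 / per_tile=170 are the published GPT-4o figures; other
    models in this family plug in different constants.
    """
    if detail == "low":
        return base  # fixed cost, regardless of resolution
    # 1) Scale down to fit within a 2048x2048 square.
    scale = min(1.0, 2048 / max(width, height))
    width, height = width * scale, height * scale
    # 2) Scale down so the shortest side is at most 768px.
    scale = 768 / min(width, height)
    if scale < 1.0:
        width, height = width * scale, height * scale
    # 3) Count 512x512 tiles.
    tiles = math.ceil(width / 512) * math.ceil(height / 512)
    return base + tiles * per_tile
```

For example, a 1024×1024 image at `detail: high` resizes to 768×768, yields 4 tiles, and costs 85 + 4 × 170 = 765 tokens.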
PATCH × MULTIPLIER system
- Uses patch-based calculation (ceil(w/32) × ceil(h/32))
- Multiplied by a model-specific multiplier
- Applies to models like GPT-5-mini / GPT-5-nano
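And a sketch of the PATCH × MULTIPLIER system, assuming the 32px patch size and a 1536-patch cap from the docs; the cap handling here is simplified to a clamp (the docs describe scaling the image down to fit under the cap), and the multiplier is a per-model placeholder, not an authoritative value:

```python
import math

def patch_tokens(width: int, height: int, multiplier: float = 1.0,
                 cap: int = 1536) -> int:
    """Estimate image tokens under the PATCH x MULTIPLIER system.

    32px patches and a 1536-patch cap per the docs; `multiplier` is a
    model-specific value looked up from the pricing table.
    """
    patches = math.ceil(width / 32) * math.ceil(height / 32)
    # Simplification: clamp to the cap for a rough upper-bound estimate
    # (the docs instead shrink the image until it fits under the cap).
    patches = min(patches, cap)
    return math.ceil(patches * multiplier)
```

Note that this formula has no `detail: low` branch at all, which is exactly why the two systems cannot share one documentation section.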
These two systems are described in different sections, but the documentation does not explicitly state the model scope for each, which creates confusion.
Concrete example
The statement: “Using detail: low lets the model process the image with a budget of ~85 tokens.”
reads as a general rule, but it actually holds only for some models:
- true for GPT-4o-class models,
- false for models like GPT-4o-mini or GPT-5-mini, which follow different accounting rules.
As a result, developers may assume a fixed low-detail cost when in reality the cost is model-dependent or resolution-dependent.
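To make the divergence concrete, here is the same 1024×1024 image costed under both systems; the 1.62 multiplier is illustrative (it is the published figure for one of the mini-class models, not necessarily the one a given reader is using):

```python
import math

# BASE/TILE system (GPT-4o-class), detail: low -> fixed base cost.
gpt4o_low = 85  # published low-detail base tokens for GPT-4o

# PATCH x MULTIPLIER system, illustrative multiplier of 1.62.
patches = min(math.ceil(1024 / 32) * math.ceil(1024 / 32), 1536)
patch_cost = math.ceil(patches * 1.62)

print(gpt4o_low, patch_cost)  # 85 vs 1659: roughly a 20x difference
```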
Why this matters
This is not just a wording issue:
- it affects cost predictability,
- it can cause unexpected billing behavior in production,
- and it forces developers to reverse-engineer pricing rules instead of relying on the docs.
Suggested improvement
It would greatly improve clarity if:
- each image cost formula explicitly listed the exact models it applies to, or
- the documentation clearly separated sections by cost system, not just by feature.
Thanks for your work; this feedback is shared in the spirit of making an already powerful API easier and safer to use at scale.
Additional context
No response