Cost mode:

Category: Social & Promotional Content · Rail: absolute · Typical I/O: 1286→203 tokens

Models

Frontier on this task: Claude Opus 4.7 at 8.65 / 10. Quality bar at 95%: 8.21.

[Bar chart: quality score per model on a 0 to 10 axis, with the 95% bar marked. Bars, in chart order:]
Qwen 3.5 Flash · $0.000091/call · 99% cheaper
Qwen 3.6 Plus · $0.000814/call · 93% cheaper
GPT-5.4 mini · $0.001878/call · 84% cheaper
Kimi K2.6 · $0.002034/call · 82% cheaper
DeepSeek V4 Pro · $0.002944/call · 74% cheaper
Gemini 3.1 Pro Preview · $0.005008/call · 56% cheaper
Claude Sonnet 4.6 · $0.006903/call · 40% cheaper
Claude Opus 4.7 · $0.011505/call · 0% cheaper
GPT-5.5 · $0.012520/call · -9% cheaper
Gemini 3 Flash Preview · $0.001252/call · 89% cheaper
Gemini 3.1 Flash Lite · $0.000626/call · 95% cheaper

Legend: point-estimate floor (CI low) · upper CI (less certain). Bars are sorted by blended cost, cheapest qualifier first. Greyed rows are MEDIUM+ models whose point estimate clears the bar but whose CI low does not.
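The qualification rule in the legend can be sketched as follows. This is a minimal illustration, assuming that a "qualifier" is a model whose CI low is at or above the quality bar (the page does not state the rule explicitly). The `Row` type and both function names are hypothetical, the MEDIUM+ sample condition is not modelled, and the figures are copied from this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    model: str
    quality: float        # point estimate, out of 10
    ci_low: float         # lower bound of the confidence interval
    cost_per_call: float  # blended cost in USD


QUALITY_BAR = 8.21  # "Quality bar at 95%" as displayed on this page

ROWS = [
    Row("Claude Opus 4.7", 8.65, 8.33, 0.011505),
    Row("Claude Sonnet 4.6", 8.50, 8.16, 0.006903),
    Row("Gemini 3.1 Pro Preview", 8.33, 8.18, 0.005008),
    Row("Qwen 3.5 Flash", 8.46, 8.22, 0.000091),
]


def qualifiers(rows, bar):
    """Rows whose CI low clears the bar, cheapest first (assumed rule)."""
    passing = [r for r in rows if r.ci_low >= bar]
    return sorted(passing, key=lambda r: r.cost_per_call)


def point_estimate_only(rows, bar):
    """Rows whose point estimate clears the bar but whose CI low does not."""
    return [r for r in rows if r.quality >= bar > r.ci_low]


cheapest_first = [r.model for r in qualifiers(ROWS, QUALITY_BAR)]
borderline = [r.model for r in point_estimate_only(ROWS, QUALITY_BAR)]
```

Under this assumed rule, a model with a high point estimate but a wide interval (here Claude Sonnet 4.6) falls into the borderline group and is not counted as a qualifier.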

Cost breakdown

| Model | Provider | Quality | Sample | Blended cost / call | Savings vs best | Mode |
| --- | --- | --- | --- | --- | --- | --- |
| Qwen 3.5 Flash | Alibaba Cloud (DashScope) | 8.46 / 10, CI [8.22, 8.70] | n=85 · high | $0.000091 | 99% cheaper | sync |
| Qwen 3.6 Plus | Alibaba Cloud (DashScope) | 8.48 / 10, CI [8.33, 8.64] | n=85 · ranked | $0.000814 | 93% cheaper | sync |
| GPT-5.4 mini | OpenAI | 8.62 / 10, CI [8.46, 8.77] | n=64 · ranked | $0.001878 | 84% cheaper | batch |
| Kimi K2.6 | Moonshot AI | 8.64 / 10, CI [8.52, 8.77] | n=85 · ranked | $0.002034 | 82% cheaper | batch |
| DeepSeek V4 Pro | DeepSeek | 8.58 / 10, CI [8.29, 8.86] | n=67 · high | $0.002944 | 74% cheaper | sync |
| Gemini 3.1 Pro Preview | Gemini | 8.33 / 10, CI [8.18, 8.48] | n=61 · ranked | $0.005008 | 56% cheaper | batch |
| Claude Sonnet 4.6 | Anthropic | 8.50 / 10, CI [8.16, 8.84] | n=63 · medium | $0.006903 | 40% cheaper | batch |
| Claude Opus 4.7 (best) | Anthropic | 8.65 / 10, CI [8.33, 8.96] | n=63 · medium | $0.011505 | (anchor) | batch |
| GPT-5.5 | OpenAI | 8.64 / 10, CI [8.43, 8.86] | n=64 · high | $0.012520 | | batch |
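The "Savings vs best" figures are consistent with comparing each blended cost against the anchor (Claude Opus 4.7) and rounding to a whole percent. The sketch below assumes that rule; the page does not state the rounding method, and `savings_vs_best` is a hypothetical helper name.

```python
ANCHOR_COST = 0.011505  # Claude Opus 4.7 blended cost per call, from the table


def savings_vs_best(cost: float, anchor_cost: float = ANCHOR_COST) -> int:
    """Percent cheaper than the anchor, rounded to the nearest whole percent.

    A negative result means the model costs more than the anchor.
    """
    return round((1 - cost / anchor_cost) * 100)


qwen_flash = savings_vs_best(0.000091)  # Qwen 3.5 Flash
kimi = savings_vs_best(0.002034)        # Kimi K2.6
gpt_55 = savings_vs_best(0.012520)      # GPT-5.5, pricier than the anchor
```

A negative value, as for GPT-5.5, simply means the model is more expensive per call than the anchor.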

Typical call shape for this task: 1286 input tokens → 203 output tokens, tracked as an exponential moving average (EMA) of production traffic. Blended cost = (in × in_price + out × out_price), rounded to 6 decimals.
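As a worked instance of the blended-cost formula, the sketch below applies it to the typical call shape. The per-token prices are hypothetical placeholders (the page does not list any model's prices), and `blended_cost` is an illustrative name.

```python
TYPICAL_IN_TOKENS = 1286   # from the typical call shape above
TYPICAL_OUT_TOKENS = 203


def blended_cost(tokens_in: int, tokens_out: int,
                 in_price: float, out_price: float) -> float:
    """Blended cost per call: in × in_price + out × out_price, to 6 decimals.

    Prices are in USD per single token.
    """
    return round(tokens_in * in_price + tokens_out * out_price, 6)


# Hypothetical prices: $0.50 per million input tokens, $2.00 per million output.
example_in_price = 0.50 / 1_000_000
example_out_price = 2.00 / 1_000_000

example_cost = blended_cost(TYPICAL_IN_TOKENS, TYPICAL_OUT_TOKENS,
                            example_in_price, example_out_price)
```

Because the input side dominates this call shape (1286 versus 203 tokens), the input price tends to matter more to the blended cost here than it would for a generation-heavy task.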