Cost mode:

Category: Social & Promotional Content · Rail: absolute · Typical I/O: 3069→252 tokens

Models

Frontier on this task: DeepSeek V4 Pro at 8.23 / 10. Quality bar at 95%: 7.82.

Chart: per-model bars on a 0 to 10 axis with the 95% bar marked. Bar labels:

Qwen 3.6 Plus · $0.001489/call · 76% cheaper
Kimi K2.6 · $0.003924/call · 37% cheaper
DeepSeek V4 Pro · $0.006217/call · 0% cheaper
Claude Sonnet 4.6 · $0.012987/call · -109% cheaper
GPT-5.5 · $0.022905/call · -268% cheaper
Qwen 3.5 Flash · $0.000158/call · 97% cheaper
Haiku 4.5 · $0.004329/call · 30% cheaper
Claude Opus 4.7 · $0.021645/call · -248% cheaper
DeepSeek V4 Flash · $0.000500/call · 92% cheaper
Gemini 3 Flash Preview · $0.002290/call · 63% cheaper
Gemini 3.1 Flash Lite · $0.001145/call · 82% cheaper
Gemini 3.1 Pro Preview · $0.009162/call · -47% cheaper
MiniMax M2.5 · $0.001223/call · 80% cheaper
GPT-5.4 mini · $0.003436/call · 45% cheaper
GPT-5.4 nano · $0.000929/call · 85% cheaper

Legend: point-estimate floor (CI low) · upper CI (less certain).

Bars are sorted by blended cost, cheapest qualifier first. Greyed rows are MEDIUM+ models whose point estimate clears the bar but whose CI low does not.
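The qualification rule can be sketched in a few lines of Python. This is a minimal illustration, not the dashboard's actual logic: it assumes the quality bar is 95% of the frontier score (8.23 × 0.95 ≈ 7.82, which matches the figure quoted above) and that a model qualifies when its CI low is at or above the bar. The function names and the `>=` comparison are hypothetical, and the MEDIUM+ tier condition is not modelled.

```python
# Sketch only: names and the exact comparison (>= vs >) are assumptions.
FRONTIER_SCORE = 8.23   # DeepSeek V4 Pro, from the page
BAR_FRACTION = 0.95     # "Quality bar at 95%"


def quality_bar(frontier: float, fraction: float = BAR_FRACTION) -> float:
    """Quality bar as a fraction of the frontier score, rounded to 2 decimals."""
    return round(frontier * fraction, 2)


def classify(point: float, ci_low: float, bar: float) -> str:
    """Classify a model against the bar using its point estimate and CI low."""
    if ci_low >= bar:
        return "qualifies"   # even the lower confidence bound clears the bar
    if point >= bar:
        return "greyed"      # point estimate clears the bar, CI low does not
    return "below bar"


bar = quality_bar(FRONTIER_SCORE)
print(bar)                          # 7.82
print(classify(8.20, 8.03, bar))    # Qwen 3.6 Plus
print(classify(7.87, 7.63, bar))    # Claude Sonnet 4.6
```

Under these assumptions, Qwen 3.6 Plus, Kimi K2.6, and DeepSeek V4 Pro qualify, while Claude Sonnet 4.6 and GPT-5.5 fall into the greyed group.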

Cost breakdown

Model | Quality | Sample | Blended cost / call | Savings vs best | Mode
Qwen 3.6 Plus, Alibaba Cloud (DashScope) | 8.20 / 10, CI [8.03, 8.37] | n=100 · ranked | $0.001489 | 76% cheaper | sync
Kimi K2.6, Moonshot AI | 8.21 / 10, CI [7.99, 8.44] | n=100 · high | $0.003924 | 37% cheaper | batch
DeepSeek V4 Pro (best), DeepSeek | 8.23 / 10, CI [7.98, 8.48] | n=96 · high | $0.006217 | (anchor) | sync
Claude Sonnet 4.6, Anthropic | 7.87 / 10, CI [7.63, 8.10] | n=94 · high | $0.012987 | | batch
GPT-5.5, OpenAI | 7.94 / 10, CI [7.74, 8.15] | n=100 · high | $0.022905 | | batch
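
The savings figures appear consistent with a simple ratio against the anchor's blended cost. The sketch below assumes savings = 1 − cost / anchor cost, expressed as a whole percentage; the function name is hypothetical and the formula is inferred from the published numbers rather than stated on the page.

```python
# Sketch only: formula inferred from the table, not confirmed by the source.
ANCHOR_COST = 0.006217  # DeepSeek V4 Pro blended cost per call


def savings_vs_anchor(cost: float, anchor_cost: float = ANCHOR_COST) -> int:
    """Percent cheaper than the anchor; negative means more expensive."""
    return round((1 - cost / anchor_cost) * 100)


print(savings_vs_anchor(0.001489))  # Qwen 3.6 Plus -> 76
print(savings_vs_anchor(0.003924))  # Kimi K2.6 -> 37
print(savings_vs_anchor(0.012987))  # Claude Sonnet 4.6 -> -109
```

A negative value, such as -109 for Claude Sonnet 4.6, means the model costs roughly twice as much per call as the anchor.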

Typical call shape for this task: 3069 input tokens → 252 output tokens, EMA-tracked from production traffic. Blended cost = (in × in_price + out × out_price), rounded to 6 decimals.
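
As a worked instance of the blended-cost formula, the sketch below plugs in the typical call shape. The per-million-token prices are purely hypothetical placeholders (the page does not list any model's token prices), and the function name is an assumption.

```python
# Sketch only: prices below are hypothetical, not any listed model's rates.
def blended_cost(in_tokens: int, out_tokens: int,
                 in_price_per_m: float, out_price_per_m: float) -> float:
    """(in x in_price + out x out_price), rounded to 6 decimals.

    Prices are given in USD per million tokens.
    """
    total = in_tokens * in_price_per_m + out_tokens * out_price_per_m
    return round(total / 1_000_000, 6)


# Typical call shape from the page: 3069 input tokens, 252 output tokens.
# Hypothetical prices: $1.00 per million input, $4.00 per million output.
print(blended_cost(3069, 252, 1.00, 4.00))  # 0.004077
```

Because input tokens outnumber output tokens by about 12 to 1 for this task, the input price dominates the blended cost unless the output price is much higher.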