Cost mode:

Category: Relevance, Classification & Matching · Rail: absolute · Typical I/O: 254→24 tokens

Models

Frontier on this task: Gemini 3 Flash Preview at 10.01 / 10. Quality bar at 95%: 9.51.
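The quality bar above is consistent with taking 95% of the frontier score (0.95 × 10.01 ≈ 9.51). A minimal Python sketch of that arithmetic follows. The function name and the two-decimal rounding are assumptions for illustration, not something the page states.

```python
def quality_bar(frontier_score: float, fraction: float = 0.95) -> float:
    """Return the qualifying threshold as a fraction of the frontier score.

    Assumption: the bar is rounded to two decimals, as displayed on the page.
    """
    return round(frontier_score * fraction, 2)


bar = quality_bar(10.01)
print(bar)
```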

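Chart: quality score per model (axis 0 to 10) with the 95% quality bar marked. Bar labels give blended cost per call and cost relative to Gemini 3 Flash Preview:

GPT-5.4 nano: $0.000040/call, 60% cheaper
DeepSeek V4 Flash: $0.000042/call, 58% cheaper
Gemini 3.1 Flash Lite: $0.000050/call, 50% cheaper
Gemini 3 Flash Preview: $0.000100/call, 0% (anchor)
MiniMax M2.5: $0.000105/call, 5% more expensive
GPT-5.4 mini: $0.000149/call, 49% more expensive
Haiku 4.5: $0.000187/call, 87% more expensive
Kimi K2.6: $0.000202/call, 102% more expensive
Gemini 3.1 Pro Preview: $0.000398/call, 298% more expensive
DeepSeek V4 Pro: $0.000525/call, 425% more expensive
Claude Sonnet 4.6: $0.000561/call, 461% more expensive
GPT-5.5: $0.000995/call, 895% more expensive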

Chart legend: point-estimate floor (CI low) · upper CI (less certain). Bars are sorted by blended cost, cheapest qualifier first.

Cost breakdown

| Model | Quality | Sample | Blended cost / call | Savings vs best | Mode |
| --- | --- | --- | --- | --- | --- |
| GPT-5.4 nano (OpenAI) | 9.98 / 10, CI [9.80, 10.00] | n=85 · ranked | $0.000040 | 60% cheaper | batch |
| DeepSeek V4 Flash (DeepSeek) | 9.94 / 10, CI [9.67, 10.00] | n=91 · high | $0.000042 | 58% cheaper | sync |
| Gemini 3.1 Flash Lite (Gemini) | 9.99 / 10, CI [9.76, 10.00] | n=84 · high | $0.000050 | 50% cheaper | batch |
| Gemini 3 Flash Preview (Gemini), best | 10.01 / 10, CI [9.98, 10.00] | n=84 · ranked | $0.000100 | (anchor) | batch |
| MiniMax M2.5 (MiniMax) | 10.00 / 10, CI [9.80, 10.00] | n=100 · ranked | $0.000105 | | sync |
| GPT-5.4 mini (OpenAI) | 9.99 / 10, CI [9.96, 10.00] | n=85 · ranked | $0.000149 | | batch |
| Haiku 4.5 (Anthropic) | 9.99 / 10, CI [9.96, 10.00] | n=82 · ranked | $0.000187 | | batch |
| Kimi K2.6 (Moonshot AI) | 9.98 / 10, CI [9.84, 10.00] | n=100 · ranked | $0.000202 | | batch |
| Gemini 3.1 Pro Preview (Gemini) | 10.01 / 10, CI [9.98, 10.00] | n=84 · ranked | $0.000398 | | batch |
| DeepSeek V4 Pro (DeepSeek) | 9.95 / 10, CI [9.71, 10.00] | n=89 · high | $0.000525 | | sync |
| Claude Sonnet 4.6 (Anthropic) | 9.99 / 10, CI [9.96, 10.00] | n=82 · ranked | $0.000561 | | batch |
| GPT-5.5 (OpenAI) | 9.99 / 10, CI [9.96, 10.00] | n=85 · ranked | $0.000995 | | batch |
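The "cheapest qualifier first" ordering can be sketched in Python as a filter followed by a sort. This is only an illustration on a few rows from the breakdown. It assumes a model qualifies when its CI lower bound meets the 9.51 quality bar, which the page does not spell out, and the `Row` type and `rank_qualifiers` function are hypothetical names.

```python
from typing import NamedTuple


class Row(NamedTuple):
    model: str
    ci_low: float         # lower bound of the quality confidence interval
    cost_per_call: float  # blended cost in USD


QUALITY_BAR = 9.51  # 95% of the frontier score, as stated on the page

# A subset of rows copied from the cost breakdown.
rows = [
    Row("Gemini 3 Flash Preview", 9.98, 0.000100),
    Row("GPT-5.4 nano", 9.80, 0.000040),
    Row("DeepSeek V4 Flash", 9.67, 0.000042),
    Row("GPT-5.5", 9.96, 0.000995),
]


def rank_qualifiers(rows: list[Row], bar: float) -> list[Row]:
    """Keep rows whose CI low meets the bar (assumed rule), cheapest first."""
    qualifiers = [r for r in rows if r.ci_low >= bar]
    return sorted(qualifiers, key=lambda r: r.cost_per_call)


ranked = rank_qualifiers(rows, QUALITY_BAR)
for r in ranked:
    print(f"{r.model}: ${r.cost_per_call:.6f}/call")
```

Under this assumed rule every listed model clears the bar, so the ranking reduces to a sort by blended cost.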

Typical call shape for this task: 254 input tokens → 24 output tokens, EMA-tracked from production traffic. Blended cost = (in × in_price + out × out_price), rounded to 6 decimals.
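As a rough illustration of the blended-cost formula, the Python sketch below applies it to the typical call shape (254 input tokens, 24 output tokens). The per-token prices are hypothetical placeholders, since the page does not list them. The savings formula is inferred from the displayed percentages, for example $0.000040 against the $0.000100 anchor giving 60% cheaper.

```python
def blended_cost(tokens_in: int, tokens_out: int,
                 in_price: float, out_price: float) -> float:
    """Blended cost per call in USD, rounded to 6 decimals.

    Prices are per token (not per million tokens).
    """
    return round(tokens_in * in_price + tokens_out * out_price, 6)


def savings_pct(cost: float, anchor_cost: float) -> int:
    """Percent cheaper than the anchor; negative means more expensive.

    Inferred from the page's labels, not a documented formula.
    """
    return round((1 - cost / anchor_cost) * 100)


# Hypothetical prices: $0.10 per million input tokens, $0.40 per million output.
cost = blended_cost(254, 24, in_price=0.10e-6, out_price=0.40e-6)
print(f"${cost:.6f}/call")

# Costs taken from the breakdown, anchored on Gemini 3 Flash Preview.
print(savings_pct(0.000040, 0.000100))
print(savings_pct(0.000995, 0.000100))
```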