At a glance

Good enough on 20 of 35 tasks at the 90% bar. Cheapest qualifier on 2 tasks. Qualifies on no task in one category: Content Summarization & Synthesis.

Provider: OpenAI
Model name: gpt-5.5
Qualifies on: 20 / 35 tasks (at 90% bar)
Cheapest qualifier on: 2 tasks

Cost vs quality across all tasks

[Scatter plot: quality score (0–10) on the x-axis against blended cost per call on the y-axis, one point per task. Cost-axis ticks: $0.00113, $0.00555, $0.02742, $0.13540, $0.66854. Each task's quality and cost are listed in the per-task breakdown.]

qualifies at 90% bar · doesn't qualify · ★ this model is the best on that task. Lower + further right = cheaper + higher quality. Y-axis is log-scaled.
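
To make the log-scaled cost axis concrete, here is a minimal sketch of how five ticks evenly spaced on a log scale could be derived from the cheapest and most expensive cost per call in this report ($0.001125 for Language Detection, $0.668535 for markdown_newline_repair). The helper name `log_ticks` is hypothetical, and the assumption that the chart places its ticks this way is inferred from the tick labels, not confirmed by the page.

```python
import math

def log_ticks(lo, hi, count):
    """Return `count` values evenly spaced on a log scale from lo to hi."""
    step = (math.log(hi) - math.log(lo)) / (count - 1)
    return [math.exp(math.log(lo) + i * step) for i in range(count)]

# Assumed endpoints: the minimum and maximum cost per call in the report.
ticks = log_ticks(0.001125, 0.668535, 5)
for t in ticks:
    print(f"${t:.5f}")
```

Each tick is a constant factor (roughly 4.94x) above the previous one, so equal vertical distances on the chart represent equal cost ratios rather than equal dollar differences. Under these assumptions the computed values come out close to the tick labels shown on the chart.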

Per-task breakdown

| Category | Task | Quality | Confidence | Cost / call | vs best | Qualifies @ 90% |
| --- | --- | --- | --- | --- | --- | --- |
| Structured Data & Fact Extraction | claim_extraction (autogenerated) | 6.55 | medium · n=100 | $0.015800 | -2788% | no |
| Structured Data & Fact Extraction | Generic TOC Extraction | 0.00 | low · n=0 | $0.010000 | | no |
| Structured Data & Fact Extraction | region_identification (autogenerated) | 8.68 | high · n=22 | $0.024655 | -262% | no |
| Structured Data & Fact Extraction | S1 TOC extraction | 0.00 | low · n=0 | $0.039648 | | no |
| Structured Data & Fact Extraction | structured_output_extraction (autogenerated) | 9.42 | high · n=81 | $0.047185 | -1438% | yes |
| Financial Analysis & Trading Decisions | onboarding_prospect_analysis (autogenerated) | 0.00 | low · n=0 | $0.026638 | | no |
| Financial Analysis & Trading Decisions | SEC Filling Analysis | 8.47 | ranked · n=100 | $0.072695 | best | yes |
| Financial Analysis & Trading Decisions | sec-s1-chunk-analysis (autogenerated) | 8.84 | ranked · n=84 | $0.115035 | best | yes |
| Financial Analysis & Trading Decisions | synthesis_analysis (autogenerated) | 8.01 | medium · n=29 | $0.113260 | -73% | no |
| Financial Analysis & Trading Decisions | Trading Recommendation | 8.73 | ranked · n=80 | $0.273628 | -98% | yes |
| Infrastructure & Utility | claim_refinement (autogenerated) | 6.96 | medium · n=100 | $0.015992 | -669% | no |
| Infrastructure & Utility | image_prompt_generation (autogenerated) | 7.88 | ranked · n=100 | $0.031002 | -16% | no |
| Infrastructure & Utility | LLM Prompt Adaptation | 8.60 | ranked · n=100 | $0.087592 | -98% | no |
| Infrastructure & Utility | markdown_newline_repair (autogenerated) | 1.50 | low · n=1 | $0.668535 | | no |
| Infrastructure & Utility | metadata_paragraph_improvement (autogenerated) | 7.92 | high · n=91 | $0.006260 | -669% | no |
| Infrastructure & Utility | onboarding_chapter_prompt_generation (autogenerated) | 9.18 | ranked · n=78 | $0.433970 | best | yes |
| Infrastructure & Utility | query_generation (autogenerated) | 8.99 | medium · n=71 | $0.022538 | best | yes |
| Infrastructure & Utility | query_validation (autogenerated) | 8.18 | high · n=100 | $0.010298 | -5993% | no |
| Infrastructure & Utility | Translation | 8.21 | high · n=90 | $0.031788 | best | yes |
| Long-form Content Generation | author_soul_generation (autogenerated) | 9.26 | high · n=93 | $0.142830 | -19% | yes |
| Long-form Content Generation | Claim-Referenced Analyst Writing (pooled) | 8.53 | high · n=90 | $0.096475 | -83% | yes |
| Long-form Content Generation | onboarding_chapter_generation (autogenerated) | 0.00 | low · n=0 | $0.192310 | | no |
| Long-form Content Generation | section_generation (autogenerated) | 8.91 | ranked · n=6 | $0.037160 | -669% | yes |
| Long-form Content Generation | Substack Newsletter (pooled) | 8.85 | low · n=4 | $0.065500 | -19% | no |
| Long-form Content Generation | theme_generation (autogenerated) | 0.00 | low · n=0 | $0.057745 | | no |
| Relevance, Classification & Matching | at_content_domain_suggest (autogenerated) | 0.00 | low · n=0 | $0.015252 | | no |
| Relevance, Classification & Matching | Author Matching | 8.57 | ranked · n=100 | $0.035952 | -46% | yes |
| Relevance, Classification & Matching | author_living_check (autogenerated) | 8.95 | high · n=64 | $0.050700 | -493% | yes |
| Relevance, Classification & Matching | Language Detection | 9.99 | ranked · n=85 | $0.001125 | -904% | yes |
| Relevance, Classification & Matching | subreddit_selection (autogenerated) | 5.69 | high · n=62 | $0.021258 | best | yes |
| Relevance, Classification & Matching | subreddit_vetting (autogenerated) | 0.00 | low · n=0 | $0.034960 | | no |
| Relevance, Classification & Matching | topic_client_matching (autogenerated) | 8.08 | high · n=100 | $0.109840 | -3459% | yes |
| Relevance, Classification & Matching | vetted_site_selection (autogenerated) | 0.00 | low · n=0 | $0.129475 | | no |
| Relevance, Classification & Matching | x_post_selection (autogenerated) | 8.47 | ranked · n=99 | $0.016400 | best | yes |
| Social & Promotional Content | activity_promo_generation (autogenerated) | 8.64 | high · n=64 | $0.012560 | -9% | yes |
| Social & Promotional Content | auto_reddit_post_generation (autogenerated) | 7.63 | medium · n=100 | $0.363085 | -624% | no |
| Social & Promotional Content | Social Post Promo (pooled) | 7.94 | high · n=100 | $0.019300 | -14091% | yes |
| Social & Promotional Content | x_com_messages_for_promotion (autogenerated) | 0.00 | low · n=0 | $0.172675 | | no |
| Content Summarization & Synthesis | content-summarization (autogenerated) | 5.73 | high · n=100 | $0.049652 | -2320% | no |
| Content Summarization & Synthesis | Direct Browse Content Synthesis | 0.00 | low · n=0 | $0.013865 | | no |
| Content Summarization & Synthesis | executive_summary_generation (autogenerated) | 8.50 | low · n=1 | $0.200965 | | no |
| Content Summarization & Synthesis | synthesis_of_titles_for_publication (autogenerated) | 8.11 | ranked · n=67 | $0.034648 | -245% | no |
| Topic Organization & Clustering | ps_section_reassignment (autogenerated) | 0.00 | low · n=0 | $0.048405 | | no |
| Topic Organization & Clustering | Topic Discovery Clustering (pooled) | 6.38 | medium · n=100 | $0.335212 | -10% | no |
| Topic Organization & Clustering | topic_cluster_naming (autogenerated) | 8.59 | ranked · n=89 | $0.034990 | best | yes |
| Topic Organization & Clustering | topic_clustering_assign_sections (autogenerated) | 8.73 | high · n=100 | $0.006658 | best | yes |
| Topic Organization & Clustering | topic_sequence_determination (autogenerated) | 0.00 | low · n=0 | $0.039378 | | no |
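
The at-a-glance figures can be re-derived from the breakdown rows. The sketch below is a minimal tally, not the dashboard's own logic. It rests on two assumptions that the page does not state: a row counts as qualifying when its final column is not "no", and the 35-task denominator counts only tasks with at least one sample (n > 0). Each tuple is (sample count n, qualifies), in the same row order as the breakdown; the `ROWS` name is hypothetical.

```python
# (n, qualifies) per task, grouped by category, in breakdown order.
# Assumption: a row qualifies when its last column is not "no".
ROWS = {
    "Structured Data & Fact Extraction":
        [(100, False), (0, False), (22, False), (0, False), (81, True)],
    "Financial Analysis & Trading Decisions":
        [(0, False), (100, True), (84, True), (29, False), (80, True)],
    "Infrastructure & Utility":
        [(100, False), (100, False), (100, False), (1, False), (91, False),
         (78, True), (71, True), (100, False), (90, True)],
    "Long-form Content Generation":
        [(93, True), (90, True), (0, False), (6, True), (4, False), (0, False)],
    "Relevance, Classification & Matching":
        [(0, False), (100, True), (64, True), (85, True), (62, True),
         (0, False), (100, True), (0, False), (99, True)],
    "Social & Promotional Content":
        [(64, True), (100, False), (100, True), (0, False)],
    "Content Summarization & Synthesis":
        [(100, False), (0, False), (1, False), (67, False)],
    "Topic Organization & Clustering":
        [(0, False), (100, False), (89, True), (100, True), (0, False)],
}

# Assumption: only tasks with at least one sample count toward the total.
evaluated = sum(1 for rows in ROWS.values() for n, _ in rows if n > 0)
qualified = sum(1 for rows in ROWS.values() for _, ok in rows if ok)

# Categories in which no task qualifies.
no_qualifier = [cat for cat, rows in ROWS.items()
                if not any(ok for _, ok in rows)]

print(f"Qualifies on {qualified} / {evaluated} tasks")
print("No qualifying task in:", no_qualifier)
```

Under those assumptions the tally gives 20 of 35 tasks, with Content Summarization & Synthesis as the only category that has no qualifying task, which matches the summary at the top of the page.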