Claude Opus 4.7 on DTP Benchmark

At a glance

Good enough on 14/35 tasks at the 90% bar. Cheapest qualifier on 1 task. Doesn't qualify on any task in: Content Summarization & Synthesis.

Provider: Anthropic
Model name: claude-opus-4-7
Qualifies on: 14 / 35 tasks (at 90% bar)
Cheapest qualifier on: 1 task

Cost vs quality across all tasks

[Chart: cost per call (y-axis, log-scaled) vs. quality (x-axis), one point per task.] Legend: qualifies at 90% bar · doesn't qualify · ★ this model is the best on that task. Lower and further right means cheaper and higher quality.
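
As a rough illustration of why a log-scaled cost axis suits this chart, the sketch below compares the cheapest and most expensive per-call costs listed in the per-task breakdown. It assumes the chart plots exactly those costs; the variable names are illustrative only.

```python
import math

# Lowest and highest "Cost / call" values in the per-task breakdown.
cheapest = 0.005370   # metadata_paragraph_improvement autogenerated
priciest = 0.561430   # markdown_newline_repair autogenerated

# Ratio between the extremes, and the same span in powers of ten (decades).
ratio = priciest / cheapest
decades = math.log10(ratio)

print(f"cost range spans about {ratio:.0f}x, i.e. {decades:.1f} decades")
```

With costs spread over roughly two orders of magnitude, a linear axis would squeeze most tasks against the bottom edge, while a log axis gives each decade equal height.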

Per-task breakdown

Category	Task	Quality	Confidence	Cost / call	vs best	Qualifies @ 90%
Structured Data & Fact Extraction	claim_extraction autogenerated	7.04	high · n=100	$0.014290	-2512%	no
Structured Data & Fact Extraction	Generic TOC Extraction	0.00	low · n=0	$0.008750	—	no
Structured Data & Fact Extraction	region_identification autogenerated	9.15	ranked · n=25	$0.020898	-207%	✓
Structured Data & Fact Extraction	S1 TOC extraction	0.00	low · n=0	$0.034120	—	no
Structured Data & Fact Extraction	structured_output_extraction autogenerated	9.03	ranked · n=83	$0.041530	-1254%	no
Financial Analysis & Trading Decisions	onboarding_prospect_analysis autogenerated	0.00	low · n=0	$0.022960	—	no
Financial Analysis & Trading Decisions	SEC Filing Analysis	7.92	high · n=100	$0.061295	16%	no
Financial Analysis & Trading Decisions	sec-s1-chunk-analysis autogenerated	8.19	high · n=81	$0.100820	12%	no
Financial Analysis & Trading Decisions	synthesis_analysis autogenerated	8.77	high · n=25	$0.108825	-67%	✓
Financial Analysis & Trading Decisions	Trading Recommendation	8.75	ranked · n=86	$0.230470	-67%	✓
Infrastructure & Utility	claim_refinement autogenerated	6.66	medium · n=100	$0.014598	-602%	no
Infrastructure & Utility	image_prompt_generation autogenerated	8.57	ranked · n=100	$0.026798	best	✓
Infrastructure & Utility	LLM Prompt Adaptation	8.83	ranked · n=100	$0.073908	-67%	✓
Infrastructure & Utility	markdown_newline_repair autogenerated	1.50	low · n=1	$0.561430	—	no
Infrastructure & Utility	metadata_paragraph_improvement autogenerated	4.48	medium · n=90	$0.005370	-560%	no
Infrastructure & Utility	onboarding_chapter_prompt_generation autogenerated	8.98	high · n=75	$0.363502	16%	✓
Long-form Content Generation	author_soul_generation autogenerated	9.53	ranked · n=84	$0.119800	best	✓
Long-form Content Generation	Claim-Referenced Analyst Writing (pooled)	8.82	high · n=100	$0.087628	-67%	✓
Long-form Content Generation	onboarding_chapter_generation autogenerated	0.00	low · n=0	$0.160940	—	no
Long-form Content Generation	section_generation autogenerated	9.06	high · n=6	$0.031248	-547%	✓
Long-form Content Generation	Substack Newsletter (pooled)	9.51	ranked · n=5	$0.055020	best	✓
Long-form Content Generation	theme_generation autogenerated	0.00	low · n=0	$0.052740	—	no
Relevance, Classification & Matching	at_content_domain_suggest autogenerated	0.00	low · n=0	$0.014528	—	no
Relevance, Classification & Matching	Author Matching	7.17	high · n=100	$0.035810	-45%	no
Relevance, Classification & Matching	Relevance Scoring (POST)	0.00	low · n=0	$0.008750	-119%	no
Relevance, Classification & Matching	Relevance Scoring (Topic Report)	0.00	low · n=0	$0.008750	-5369%	no
Relevance, Classification & Matching	Relevance Scoring (X Post)	0.00	low · n=0	$0.008750	—	no
Relevance, Classification & Matching	subreddit_selection autogenerated	5.24	high · n=63	$0.019760	7%	no
Relevance, Classification & Matching	subreddit_vetting autogenerated	0.00	low · n=0	$0.030125	—	no
Relevance, Classification & Matching	topic_client_matching autogenerated	7.40	high · n=100	$0.096155	-3016%	no
Relevance, Classification & Matching	vetted_site_selection autogenerated	0.00	low · n=0	$0.115895	—	no
Relevance, Classification & Matching	x_post_selection autogenerated	8.31	high · n=100	$0.016140	2%	✓
Social & Promotional Content	activity_promo_generation autogenerated	8.65	medium · n=63	$0.011540	best	✓
Social & Promotional Content	auto_reddit_post_generation autogenerated	8.18	high · n=100	$0.307700	-513%	✓
Social & Promotional Content	Social Post Promo (pooled)	7.76	medium · n=94	$0.018040	-13165%	no
Social & Promotional Content	x_com_messages_for_promotion autogenerated	0.00	low · n=0	$0.161185	—	no
Content Summarization & Synthesis	content-summarization autogenerated	6.96	high · n=100	$0.043110	-2001%	no
Content Summarization & Synthesis	Direct Browse Content Synthesis	0.00	low · n=0	$0.012565	—	no
Content Summarization & Synthesis	executive_summary_generation autogenerated	8.74	low · n=2	$0.178540	—	no
Content Summarization & Synthesis	synthesis_of_titles_for_publication autogenerated	8.09	high · n=68	$0.030045	-199%	no
Topic Organization & Clustering	ps_section_reassignment autogenerated	0.00	low · n=0	$0.043610	—	no
Topic Organization & Clustering	Topic Discovery Clustering (pooled)	8.77	high · n=98	$0.305172	best	✓
Topic Organization & Clustering	topic_cluster_naming autogenerated	7.88	ranked · n=94	$0.033962	3%	no
Topic Organization & Clustering	topic_sequence_determination autogenerated	0.00	low · n=0	$0.034352	—	no
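
The headline counts can be rechecked directly from the table. The following is a minimal sketch, assuming the rows are available as tab-separated text in the column order shown in the header; `parse_row` and `tally` are hypothetical helper names, and `ROWS` holds only a five-row excerpt of the table above.

```python
from collections import defaultdict

# Excerpt of the table rows (Category, Task, Quality, Confidence,
# Cost / call, vs best, Qualifies @ 90%), tab-separated.
ROWS = (
    "Structured Data & Fact Extraction\tregion_identification autogenerated\t9.15\tranked · n=25\t$0.020898\t-207%\t✓\n"
    "Content Summarization & Synthesis\tcontent-summarization autogenerated\t6.96\thigh · n=100\t$0.043110\t-2001%\tno\n"
    "Content Summarization & Synthesis\tsynthesis_of_titles_for_publication autogenerated\t8.09\thigh · n=68\t$0.030045\t-199%\tno\n"
    "Topic Organization & Clustering\tTopic Discovery Clustering (pooled)\t8.77\thigh · n=98\t$0.305172\tbest\t✓\n"
    "Topic Organization & Clustering\ttopic_cluster_naming autogenerated\t7.88\tranked · n=94\t$0.033962\t3%\tno\n"
)


def parse_row(line):
    """Split one tab-separated row into a dict with typed fields."""
    category, task, quality, confidence, cost, vs_best, qualifies = line.split("\t")
    return {
        "category": category,
        "task": task,
        "quality": float(quality),
        "n": int(confidence.rsplit("n=", 1)[1]),   # sample size after "n="
        "cost": float(cost.lstrip("$")),
        "vs_best": vs_best,                         # kept as raw text
        "qualifies": qualifies.strip() == "✓",
    }


def tally(rows_text):
    """Return {category: (qualifying_tasks, total_tasks)}."""
    counts = defaultdict(lambda: [0, 0])
    for line in rows_text.splitlines():
        if not line.strip():
            continue
        row = parse_row(line)
        counts[row["category"]][0] += int(row["qualifies"])
        counts[row["category"]][1] += 1
    return {category: tuple(pair) for category, pair in counts.items()}


summary = tally(ROWS)
for category, (ok, total) in summary.items():
    print(f"{category}: {ok}/{total} qualify")
```

Run against the full table instead of the excerpt, the same tally gives the per-category qualifier counts, and any category whose first number is zero is one where the model does not qualify on any task.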