<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>DTP LLM Benchmark</title><link>https://llm-bench.kapualabs.com/</link><description>Recent content on DTP LLM Benchmark</description><generator>Hugo</generator><language>en</language><lastBuildDate>Thu, 21 May 2026 17:20:36 +0000</lastBuildDate><atom:link href="https://llm-bench.kapualabs.com/index.xml" rel="self" type="application/rss+xml"/><item><title>DTP LLM Benchmark launches</title><link>https://llm-bench.kapualabs.com/journal/launch/</link><pubDate>Thu, 21 May 2026 17:20:36 +0000</pubDate><guid>https://llm-bench.kapualabs.com/journal/launch/</guid><description>&lt;p&gt;We&amp;rsquo;re launching the DTP LLM Benchmark — independent rankings of leading LLMs
on real production tasks from our financial-analyst pipeline at Nova.&lt;/p&gt;
&lt;p&gt;The premise is simple: most LLM tasks have a quality &lt;em&gt;threshold&lt;/em&gt;, not a quality
&lt;em&gt;maximum&lt;/em&gt;. The best-performing model is overkill for most steps. We show you which
models actually clear your quality bar on each task, and rank them by cost
ascending. The cheapest qualifier wins — not the highest-scoring one.&lt;/p&gt;</description></item><item><title>About — DTP LLM Benchmark</title><link>https://llm-bench.kapualabs.com/about/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/about/</guid><description>&lt;h2 id="what-this-is"&gt;What this is&lt;/h2&gt;
&lt;p&gt;An independent, transparent LLM benchmark scoped to the work of building
LLM-backed business processes. Every score on this site is derived from
production workloads run by Kapua Labs through DTP, not from synthetic
test sets, not from customer data, not from self-reported model numbers.&lt;/p&gt;
&lt;h2 id="methodology"&gt;Methodology&lt;/h2&gt;
&lt;p&gt;Full methodology lives at &lt;a href="https://llm-bench.kapualabs.com/methodology/"&gt;/methodology/&lt;/a&gt; — what the
quality score is, how confidence is gated, what fan-out and judging
means, why the slider default moves week to week.&lt;/p&gt;</description></item><item><title>Best LLMs for activity_promo_generation autogenerated — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/activity_promo_generation/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/activity_promo_generation/</guid><description/></item><item><title>Best LLMs for at_content_domain_suggest autogenerated — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/at_content_domain_suggest/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/at_content_domain_suggest/</guid><description/></item><item><title>Best LLMs for Author Matching — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/author_matching/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/author_matching/</guid><description/></item><item><title>Best LLMs for author_living_check autogenerated — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/author_living_check/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/author_living_check/</guid><description/></item><item><title>Best LLMs for author_soul_generation autogenerated — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/author_soul_generation/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/author_soul_generation/</guid><description/></item><item><title>Best LLMs for auto_reddit_post_generation autogenerated — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/auto_reddit_post_generation/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/auto_reddit_post_generation/</guid><description/></item><item><title>Best LLMs for claim_extraction autogenerated — DTP 
Benchmark</title><link>https://llm-bench.kapualabs.com/task/claim_extraction/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/claim_extraction/</guid><description/></item><item><title>Best LLMs for claim_refinement autogenerated — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/claim_refinement/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/claim_refinement/</guid><description/></item><item><title>Best LLMs for Claim-Referenced Analyst Writing (pooled) — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/claim_referenced_analyst_writing/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/claim_referenced_analyst_writing/</guid><description/></item><item><title>Best LLMs for Content Summarization &amp; Synthesis — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/category/summarization_and_synthesis/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/category/summarization_and_synthesis/</guid><description/></item><item><title>Best LLMs for content-summarization autogenerated — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/content-summarization/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/content-summarization/</guid><description/></item><item><title>Best LLMs for Direct Browse Content Synthesis — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/direct-browse-content-synthesis/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/direct-browse-content-synthesis/</guid><description/></item><item><title>Best LLMs for executive_summary_generation autogenerated — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/executive_summary_generation/</link><pubDate>Mon, 01 Jan 0001 00:00:00 
+0000</pubDate><guid>https://llm-bench.kapualabs.com/task/executive_summary_generation/</guid><description/></item><item><title>Best LLMs for Financial Analysis &amp; Trading Decisions — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/category/financial_analysis_and_trading/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/category/financial_analysis_and_trading/</guid><description/></item><item><title>Best LLMs for Generic TOC Extraction — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/generic-toc-extraction/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/generic-toc-extraction/</guid><description/></item><item><title>Best LLMs for image_prompt_generation autogenerated — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/image_prompt_generation/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/image_prompt_generation/</guid><description/></item><item><title>Best LLMs for Infrastructure &amp; Utility — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/category/infrastructure_and_utility/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/category/infrastructure_and_utility/</guid><description/></item><item><title>Best LLMs for Language Detection — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/language-detection/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/language-detection/</guid><description/></item><item><title>Best LLMs for LLM Prompt Adaptation — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/prompt-adaptation/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/prompt-adaptation/</guid><description/></item><item><title>Best LLMs for Long-form Content Generation — DTP 
Benchmark</title><link>https://llm-bench.kapualabs.com/category/long_form_writing/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/category/long_form_writing/</guid><description/></item><item><title>Best LLMs for markdown_newline_repair autogenerated — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/markdown_newline_repair/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/markdown_newline_repair/</guid><description/></item><item><title>Best LLMs for metadata_paragraph_improvement autogenerated — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/metadata_paragraph_improvement/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/metadata_paragraph_improvement/</guid><description/></item><item><title>Best LLMs for onboarding_chapter_generation autogenerated — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/onboarding_chapter_generation/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/onboarding_chapter_generation/</guid><description/></item><item><title>Best LLMs for onboarding_chapter_prompt_generation autogenerated — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/onboarding_chapter_prompt_generation/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/onboarding_chapter_prompt_generation/</guid><description/></item><item><title>Best LLMs for onboarding_prospect_analysis autogenerated — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/onboarding_prospect_analysis/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/onboarding_prospect_analysis/</guid><description/></item><item><title>Best LLMs for ps_section_reassignment autogenerated — DTP 
Benchmark</title><link>https://llm-bench.kapualabs.com/task/ps_section_reassignment/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/ps_section_reassignment/</guid><description/></item><item><title>Best LLMs for query_generation autogenerated — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/query_generation/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/query_generation/</guid><description/></item><item><title>Best LLMs for query_validation autogenerated — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/query_validation/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/query_validation/</guid><description/></item><item><title>Best LLMs for region_identification autogenerated — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/region_identification/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/region_identification/</guid><description/></item><item><title>Best LLMs for Relevance Scoring (POST) — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/relevance_scoring_post/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/relevance_scoring_post/</guid><description/></item><item><title>Best LLMs for Relevance Scoring (Topic Report) — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/relevance_scoring_topic_report/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/relevance_scoring_topic_report/</guid><description/></item><item><title>Best LLMs for Relevance Scoring (X Post) — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/relevance_scoring_x_post/</link><pubDate>Mon, 01 Jan 0001 00:00:00 
+0000</pubDate><guid>https://llm-bench.kapualabs.com/task/relevance_scoring_x_post/</guid><description/></item><item><title>Best LLMs for Relevance, Classification &amp; Matching — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/category/relevance_and_classification/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/category/relevance_and_classification/</guid><description/></item><item><title>Best LLMs for report_image_generation autogenerated — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/report_image_generation/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/report_image_generation/</guid><description/></item><item><title>Best LLMs for S1 TOC extraction — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/s1-toc-extraction/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/s1-toc-extraction/</guid><description/></item><item><title>Best LLMs for SEC Filing Analysis — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/sec-filling-analysis/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/sec-filling-analysis/</guid><description/></item><item><title>Best LLMs for sec-s1-chunk-analysis autogenerated — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/sec-s1-chunk-analysis/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/sec-s1-chunk-analysis/</guid><description/></item><item><title>Best LLMs for section_generation autogenerated — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/section_generation/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/section_generation/</guid><description/></item><item><title>Best LLMs for Social &amp; Promotional Content — DTP 
Benchmark</title><link>https://llm-bench.kapualabs.com/category/social_and_promotional/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/category/social_and_promotional/</guid><description/></item><item><title>Best LLMs for Social Post Promo (pooled) — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/social_post_promo/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/social_post_promo/</guid><description/></item><item><title>Best LLMs for Structured Data &amp; Fact Extraction — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/category/extraction_and_parsing/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/category/extraction_and_parsing/</guid><description/></item><item><title>Best LLMs for structured_output_extraction autogenerated — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/structured_output_extraction/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/structured_output_extraction/</guid><description/></item><item><title>Best LLMs for subreddit_selection autogenerated — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/subreddit_selection/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/subreddit_selection/</guid><description/></item><item><title>Best LLMs for subreddit_vetting autogenerated — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/subreddit_vetting/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/subreddit_vetting/</guid><description/></item><item><title>Best LLMs for Substack Newsletter (pooled) — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/substack_newsletter/</link><pubDate>Mon, 01 Jan 0001 00:00:00 
+0000</pubDate><guid>https://llm-bench.kapualabs.com/task/substack_newsletter/</guid><description/></item><item><title>Best LLMs for synthesis_analysis autogenerated — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/synthesis_analysis/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/synthesis_analysis/</guid><description/></item><item><title>Best LLMs for synthesis_of_titles_for_publication autogenerated — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/synthesis_of_titles_for_publication/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/synthesis_of_titles_for_publication/</guid><description/></item><item><title>Best LLMs for theme_generation autogenerated — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/theme_generation/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/theme_generation/</guid><description/></item><item><title>Best LLMs for Topic Discovery Clustering (pooled) — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/topic_discovery_clustering/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/topic_discovery_clustering/</guid><description/></item><item><title>Best LLMs for Topic Organization &amp; Clustering — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/category/topic_organization/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/category/topic_organization/</guid><description/></item><item><title>Best LLMs for topic_client_matching autogenerated — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/topic_client_matching/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/topic_client_matching/</guid><description/></item><item><title>Best LLMs for topic_cluster_naming 
autogenerated — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/topic_cluster_naming/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/topic_cluster_naming/</guid><description/></item><item><title>Best LLMs for topic_clustering_assign_sections autogenerated — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/topic_clustering_assign_sections/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/topic_clustering_assign_sections/</guid><description/></item><item><title>Best LLMs for topic_sequence_determination autogenerated — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/topic_sequence_determination/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/topic_sequence_determination/</guid><description/></item><item><title>Best LLMs for Trading Recommendation — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/trading_recommendation/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/trading_recommendation/</guid><description/></item><item><title>Best LLMs for Translation — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/translation/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/translation/</guid><description/></item><item><title>Best LLMs for vetted_site_selection autogenerated — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/vetted_site_selection/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/vetted_site_selection/</guid><description/></item><item><title>Best LLMs for x_com_messages_for_promotion autogenerated — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/x_com_messages_for_promotion/</link><pubDate>Mon, 01 Jan 0001 00:00:00 
+0000</pubDate><guid>https://llm-bench.kapualabs.com/task/x_com_messages_for_promotion/</guid><description/></item><item><title>Best LLMs for x_post_selection autogenerated — DTP Benchmark</title><link>https://llm-bench.kapualabs.com/task/x_post_selection/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/task/x_post_selection/</guid><description/></item><item><title>Claude Opus 4.7 on DTP Benchmark</title><link>https://llm-bench.kapualabs.com/model/claude-opus-47/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/model/claude-opus-47/</guid><description/></item><item><title>Claude Sonnet 4.6 on DTP Benchmark</title><link>https://llm-bench.kapualabs.com/model/claude-sonnet-46/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/model/claude-sonnet-46/</guid><description/></item><item><title>DeepSeek V4 Flash on DTP Benchmark</title><link>https://llm-bench.kapualabs.com/model/deepseek-v4-flash/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/model/deepseek-v4-flash/</guid><description/></item><item><title>DeepSeek V4 Pro on DTP Benchmark</title><link>https://llm-bench.kapualabs.com/model/deepseek-v4-pro/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/model/deepseek-v4-pro/</guid><description/></item><item><title>Gemini 3 Flash Preview on DTP Benchmark</title><link>https://llm-bench.kapualabs.com/model/gemini-3-flash-preview/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/model/gemini-3-flash-preview/</guid><description/></item><item><title>Gemini 3 Pro Image Preview on DTP Benchmark</title><link>https://llm-bench.kapualabs.com/model/gemini-3-pro-image-preview/</link><pubDate>Mon, 01 Jan 0001 00:00:00 
+0000</pubDate><guid>https://llm-bench.kapualabs.com/model/gemini-3-pro-image-preview/</guid><description/></item><item><title>Gemini 3.1 Flash Image Preview on DTP Benchmark</title><link>https://llm-bench.kapualabs.com/model/gemini-31-flash-image-preview/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/model/gemini-31-flash-image-preview/</guid><description/></item><item><title>Gemini 3.1 Flash Lite on DTP Benchmark</title><link>https://llm-bench.kapualabs.com/model/gemini-31-flash-lite/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/model/gemini-31-flash-lite/</guid><description/></item><item><title>Gemini 3.1 Pro Preview on DTP Benchmark</title><link>https://llm-bench.kapualabs.com/model/gemini-31-pro-preview/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/model/gemini-31-pro-preview/</guid><description/></item><item><title>GPT-5.4 mini on DTP Benchmark</title><link>https://llm-bench.kapualabs.com/model/gpt-54-mini/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/model/gpt-54-mini/</guid><description/></item><item><title>GPT-5.4 nano on DTP Benchmark</title><link>https://llm-bench.kapualabs.com/model/gpt-54-nano/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/model/gpt-54-nano/</guid><description/></item><item><title>GPT-5.5 on DTP Benchmark</title><link>https://llm-bench.kapualabs.com/model/gpt-55/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/model/gpt-55/</guid><description/></item><item><title>GPT-image-1.5 on DTP Benchmark</title><link>https://llm-bench.kapualabs.com/model/gpt-image-15/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/model/gpt-image-15/</guid><description/></item><item><title>Haiku 4.5 on DTP 
Benchmark</title><link>https://llm-bench.kapualabs.com/model/haiku-45/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/model/haiku-45/</guid><description/></item><item><title>Imagen 4.0 Fast on DTP Benchmark</title><link>https://llm-bench.kapualabs.com/model/imagen-40-fast/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/model/imagen-40-fast/</guid><description/></item><item><title>Imagen 4.0 on DTP Benchmark</title><link>https://llm-bench.kapualabs.com/model/imagen-40/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/model/imagen-40/</guid><description/></item><item><title>Imagen 4.0 Ultra on DTP Benchmark</title><link>https://llm-bench.kapualabs.com/model/imagen-40-ultra/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/model/imagen-40-ultra/</guid><description/></item><item><title>Kimi K2.6 on DTP Benchmark</title><link>https://llm-bench.kapualabs.com/model/kimi-k26/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/model/kimi-k26/</guid><description/></item><item><title>Methodology — DTP LLM Benchmark</title><link>https://llm-bench.kapualabs.com/methodology/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/methodology/</guid><description>&lt;h2 id="what-this-site-does"&gt;What this site does&lt;/h2&gt;
&lt;p&gt;Most production LLM tasks have a quality &lt;em&gt;threshold&lt;/em&gt;, not a quality
&lt;em&gt;maximum&lt;/em&gt;. Once a model clears the bar, additional capability is unused — you&amp;rsquo;re paying for headroom your pipeline doesn&amp;rsquo;t exercise. This site
shows you which models clear your quality bar on each step of your
pipeline, and ranks them by cost ascending so you can pick the
right-sized model rather than reaching for the best-performing model by default.&lt;/p&gt;</description></item><item><title>MiniMax M2.5 on DTP Benchmark</title><link>https://llm-bench.kapualabs.com/model/minimax-m25/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/model/minimax-m25/</guid><description/></item><item><title>Qwen 3.5 Flash on DTP Benchmark</title><link>https://llm-bench.kapualabs.com/model/qwen-35-flash/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/model/qwen-35-flash/</guid><description/></item><item><title>Qwen 3.6 Plus on DTP Benchmark</title><link>https://llm-bench.kapualabs.com/model/qwen-36-plus/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/model/qwen-36-plus/</guid><description/></item><item><title>Subscribe — DTP LLM Benchmark</title><link>https://llm-bench.kapualabs.com/subscribe/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://llm-bench.kapualabs.com/subscribe/</guid><description>&lt;h2 id="subscribe-to-the-journal"&gt;Subscribe to the journal&lt;/h2&gt;
&lt;p&gt;Once a week, we publish a short writeup of what changed in the snapshot —
new models that landed, rank shifts on tasks that matter, what the
slider&amp;rsquo;s default movement implies. We also publish occasional deeper dives on
methodology, judge mechanics, and what we&amp;rsquo;re learning about model
selection at scale.&lt;/p&gt;
&lt;p&gt;No marketing, no upsells. You can unsubscribe at any time.&lt;/p&gt;
&lt;form
 action="https://buttondown.com/api/emails/embed-subscribe/dtp-llm-bench"
 method="post"
 target="popupwindow"
 onsubmit="window.open('https://buttondown.com/dtp-llm-bench', 'popupwindow')"
 class="subscribe-form"
&gt;
 &lt;label for="bd-email" class="visually-hidden"&gt;Email&lt;/label&gt;
 &lt;input
 type="email"
 name="email"
 id="bd-email"
 placeholder="you@example.com"
 required
 &gt;
 &lt;input type="hidden" name="tag" value="bench-site"&gt;
 &lt;button type="submit"&gt;Subscribe&lt;/button&gt;
&lt;/form&gt;


&lt;p&gt;Already use RSS? The same content lives at
&lt;a href="https://llm-bench.kapualabs.com/journal/index.xml"&gt;/journal/index.xml&lt;/a&gt;.&lt;/p&gt;</description></item></channel></rss>