Best LLMs for Reddit Post Generation

Category: Social & Promotional Content · Rail: absolute · Typical I/O: 5822→2426 tokens

Models

Frontier on this task: Gemini 3.5 Flash at 8.73 / 10. Quality bar at 90%: 7.86.
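
The relationship between the frontier score and the quality bar can be sketched as below. This is a minimal illustration, assuming the bar is 90% of the frontier score rounded to two decimals (consistent with 8.73 → 7.86, although the page does not state the rounding rule). The function names and status labels are hypothetical, not part of the benchmark.

```python
FRONTIER_SCORE = 8.73  # Gemini 3.5 Flash, as reported on this page
BAR_FRACTION = 0.90    # "Quality bar at 90%"


def quality_bar(frontier_score, fraction=BAR_FRACTION):
    """Assumed rule: bar = fraction * frontier score, rounded to 2 decimals."""
    return round(frontier_score * fraction, 2)


def bar_status(score, ci_low, bar):
    """Classify a model against the bar using its point estimate and CI low."""
    if ci_low >= bar:
        return "clears (CI low)"
    if score >= bar:
        return "clears (point estimate only)"
    return "below bar"


bar = quality_bar(FRONTIER_SCORE)
print(bar)
print(bar_status(8.73, 8.63, bar))  # Gemini 3.5 Flash
print(bar_status(7.91, 7.77, bar))  # MiniMax M3
print(bar_status(7.83, 7.68, bar))  # Qwen 3.6 Flash
```

Separating the CI-low check from the point-estimate check mirrors the distinction the chart draws between models that clear the bar confidently and those that clear it only on the point estimate.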

Chart legend: point-estimate floor (CI low) · upper CI (less certain). Bars are sorted by blended cost, with the best-value model first. Greyed rows are MEDIUM+ models whose point estimate clears the bar but whose CI low does not.

Model	Quality score	CI low	Cost / 1k runs	vs best value
MiniMax M3	7.91 / 10	7.77	$4.17	best value
DeepSeek V4 Pro	8.00 / 10	7.67	$11.24	2.7x more expensive
Gemini 3.5 Flash	8.73 / 10	8.63	$25.66	6.2x more expensive
Grok 4.5	8.25 / 10	8.11	$34.27	8.2x more expensive
Kimi K2.6	8.33 / 10	8.10	$48.06	12x more expensive
Claude Sonnet 5	8.07 / 10	7.95	$83.05	20x more expensive
GPT-5.4 Nano	6.77 / 10	6.41	$4.35	1.04x more expensive
Gemini 3.1 Flash Lite	6.40 / 10	6.07	$3.27	21% cheaper
Qwen 3.6 Plus	7.67 / 10	7.17	$27.39	6.6x more expensive
Qwen 3.5 Flash	7.35 / 10	7.05	$3.21	23% cheaper
Claude Sonnet 4.6	7.78 / 10	7.46	$87.21	21x more expensive
DeepSeek V4 Flash	7.65 / 10	7.33	$2.88	31% cheaper
GPT-5.4 Mini	6.85 / 10	6.51	$9.58	2.3x more expensive
GPT-5.5	7.81 / 10	7.43	$175.03	42x more expensive
Claude Haiku 4.5	7.61 / 10	7.29	$29.01	7x more expensive
Gemini 3.1 Pro Preview	7.83 / 10	7.65	$35.17	8.4x more expensive
GPT-5.6 Luna	7.45 / 10	7.00	$6.57	1.6x more expensive
Meta Muse Spark 1.1	7.44 / 10	6.94	$28.01	6.7x more expensive
NVIDIA Nemotron-3 Ultra 550B	7.84 / 10	7.35	$27.84	6.7x more expensive
NVIDIA Nemotron-3 Super 120B	7.02 / 10	6.57	$2.85	32% cheaper
Qwen 3.6 Flash	7.83 / 10	7.68	$10.82	2.6x more expensive
Qwen 3.7 Plus	7.77 / 10	7.52	$9.88	2.4x more expensive
Tencent Hy3	6.93 / 10	6.60	$3.11	25% cheaper

Cost breakdown

Model	Provider	Quality	Confidence	Cost / 1k runs	Overpay	Mode
MiniMax M3 ★	MiniMax	7.91 / 10 CI [7.77, 8.05]	RANKED	$4.17	best value	batch
DeepSeek V4 Pro	DeepSeek	8.00 / 10 CI [7.67, 8.33]	MEDIUM	$11.24	2.7x	batch
Gemini 3.5 Flash (best)	Gemini	8.73 / 10 CI [8.63, 8.83]	RANKED	$25.66	6.2x	batch
Grok 4.5	xAI	8.25 / 10 CI [8.11, 8.38]	RANKED	$34.27	8.2x	batch
Kimi K2.6	Moonshot AI	8.33 / 10 CI [8.10, 8.57]	HIGH	$48.06	12x	batch
Claude Sonnet 5	Anthropic	8.07 / 10 CI [7.95, 8.19]	RANKED	$83.05	20x	batch

Overpay shows how much more you pay than the best-value model that clears the quality bar (marked ★). "16x" means you pay 16× that reference for no quality benefit above the bar.

Typical call shape for this task: 5822 input tokens → 2426 output tokens, EMA-tracked from production traffic.

Cost is the observed, all-in $ per 1,000 task runs: each model's own measured usage on this task (output verbosity, thinking/reasoning tokens, cache reads and writes, and the spend on its billed failures), priced at current list rates and adjusted by the billing overhead we actually reconcile against provider invoices. Models that answer tersely cost what they actually cost; models that think at length pay for it. These figures are not comparable to providers' advertised $/1M list rates: this is what running the task costs, not a per-token price.
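
The overpay labels can be approximated from the published costs. The sketch below assumes the label is the ratio of a model's cost to the ★ reference cost, shown to two significant figures, with a percentage for cheaper models. That formatting rule is inferred from the tables, not stated by the page, and `overpay_label` is a hypothetical helper. Because the published costs are rounded to cents, a label computed this way can differ from the page by a point.

```python
REFERENCE_COST = 4.17  # MiniMax M3 (★), $ per 1k runs, from the table


def overpay_label(cost, reference_cost=REFERENCE_COST):
    """Return a label such as '2.7x more expensive' or '31% cheaper'."""
    if cost == reference_cost:
        return "best value"
    ratio = cost / reference_cost
    if ratio < 1:
        # Cheaper than the reference: report the saving as a whole percent.
        return f"{round((1 - ratio) * 100)}% cheaper"
    if ratio >= 100:
        # '.2g' would switch to exponent notation here, so round instead.
        return f"{ratio:.0f}x more expensive"
    # Two significant figures; the 'g' format drops trailing zeros (7.0 -> 7).
    return f"{ratio:.2g}x more expensive"


for name, cost in [
    ("DeepSeek V4 Pro", 11.24),
    ("Claude Haiku 4.5", 29.01),
    ("GPT-5.5", 175.03),
    ("DeepSeek V4 Flash", 2.88),
]:
    print(name, overpay_label(cost))
```

Note that a model can be cheaper than the reference and still not be the best value, because the reference is the cheapest model that clears the quality bar.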

Prompt templates

The system + user template pair used for this task.

AUTO_REDDIT_POST_SYSTEM_PROMPT + AUTO_REDDIT_POST_USER_PROMPT (2112 calls in window)

System prompt

You are a knowledgeable market analyst sharing your own research on Reddit. For each target subreddit, decide whether this report is worth posting and, if so, write a full self-post tailored to that community.

{author_voice_section}

## Per-Subreddit Decision

For each subreddit, first decide should_post:
- **Post if**: report contains actionable analysis, breaking developments, unique data, or findings specifically relevant to this community
- **Skip if**: report has no new insights for this audience, topic is tangential, content is too generic, or the subreddit's rules would prohibit this type of content

## Writing Guidelines

For each subreddit where should_post=True, write a complete self-post:

**Title (max 300 chars):**
- Insight-driven, not clickbait
- Specific to the analysis findings
- Matches the subreddit's title conventions

**Body:**
- Write a standalone piece of content, not just a summary — extract and reframe the most relevant findings for this audience
- Stock-specific subreddits: focus on the ticker, use data points, technical/fundamental analysis language
- General investing subreddits: broader market context, more accessible language, why this matters beyond one stock
- Use Reddit markdown: headers (##), **bold**, bullets, horizontal rules (---)
- MUST include the literal placeholder <ghost_url> exactly once (link to full analysis)
- Vary framing, structure, and emphasis across subreddits — each post should feel native to its community
- No duplicate content across subreddits
- Style: authentic contributor sharing their own analysis, not promotional

**Image embedding:**
- If a report image URL is provided and non-empty, you may embed it once per post via ![image](url)
- Include when it adds genuine value for the community (e.g., visual data in data-friendly subs)
- Omit the image entirely (do not output ![image]() or the word "image") when the URL is blank, the image is generic, the subreddit is text-focused, or it would feel promotional

## Important
- Never reveal you are AI or automated
- Write as the analyst who conducted this research
- Each post must feel like it was written specifically for that community

## Required Output Format
Your response MUST be a single, valid JSON object conforming to this schema:
```json
{schema_json_string}
```
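
Several constraints in the system prompt above are mechanically checkable: the 300-character title limit, exactly one `<ghost_url>` placeholder, and the image-embedding rules. The following is a minimal sketch of such a check, not the pipeline's actual validator. The function name, its arguments, and the error strings are assumptions, and the real output schema (`{schema_json_string}`) is not shown on this page.

```python
import re

GHOST_PLACEHOLDER = "<ghost_url>"
MAX_TITLE_CHARS = 300
# Matches Reddit/markdown image embeds such as ![image](https://...)
IMAGE_EMBED = re.compile(r"!\[[^\]]*\]\(([^)]*)\)")


def validate_post(title, body, image_url=""):
    """Return a list of rule violations; an empty list means the post passes."""
    problems = []
    if len(title) > MAX_TITLE_CHARS:
        problems.append("title exceeds 300 characters")
    if body.count(GHOST_PLACEHOLDER) != 1:
        problems.append("body must contain <ghost_url> exactly once")
    embeds = IMAGE_EMBED.findall(body)
    if len(embeds) > 1:
        problems.append("more than one image embed")
    if any(not url.strip() for url in embeds):
        problems.append("empty image embed")
    if embeds and not image_url.strip():
        problems.append("image embedded but no report image URL was provided")
    return problems


print(validate_post("Margins are compressing", "## Summary\n\nFull analysis: <ghost_url>"))
print(validate_post("x" * 301, "No link here ![image]()"))
```

Checks like these only cover the literal rules. Whether a post feels native to its community or avoids a promotional tone still needs a graded evaluation.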

User prompt

Generate Reddit posts for each target subreddit. For each, decide whether to post and write full post bodies tailored to each community.

Stock: {subject_code} ({subject_name})
Report Title: {title}

Full Report:
{report_text}

Report Image URL (if non-empty, you may embed once via ![image](url) where it adds genuine value — omit entirely if blank, generic, or promotional):
{report_image_url}

Target Subreddits:
{subreddits_with_context_json}

The required JSON output schema is provided in the system prompt.
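
As an illustration of how the `{placeholder}` fields in the user prompt might be filled, here is a small sketch using Python's `str.format`. The template is an abbreviated excerpt of the user prompt above, and the subreddit context fields (`name`, `rules_summary`) and all sample values are hypothetical; the page does not document the actual shape of `{subreddits_with_context_json}`.

```python
import json

# Abbreviated excerpt of the user prompt template shown above.
USER_PROMPT_EXCERPT = (
    "Stock: {subject_code} ({subject_name})\n"
    "Report Title: {title}\n"
    "\n"
    "Target Subreddits:\n"
    "{subreddits_with_context_json}\n"
)


def render_user_prompt(subject_code, subject_name, title, subreddits):
    """Fill the template; the subreddit list is serialized as JSON."""
    return USER_PROMPT_EXCERPT.format(
        subject_code=subject_code,
        subject_name=subject_name,
        title=title,
        # Braces inside the substituted JSON are safe: str.format only
        # parses the template itself, not the inserted values.
        subreddits_with_context_json=json.dumps(subreddits, indent=2),
    )


prompt = render_user_prompt(
    "XYZ",
    "Example Corp",
    "Q3 margin analysis",
    [{"name": "r/investing", "rules_summary": "No self-promotion"}],
)
print(prompt)
```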