Best LLMs for S1 TOC extraction — DTP Benchmark
Still in bootstrap
No model has reached MEDIUM confidence on this task yet. Rankings will appear here once at least one model accumulates enough judge evidence to clear the gate.