Benchmarks

Faster. Cheaper. Verified.

Measured across 87 bug-fix tasks on identical checkouts: Codna verified 87/87 (100%) at ~$0.02 per fix — 5× fewer tokens and 1.7× faster than Cursor, and 6.6× cheaper than Cline (which verified only 73.6%).

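| Metric | Codna | Cursor | Cline |
| --- | --- | --- | --- |
| Fixes verified | 87/87 | 87/87 | 64/87 |
| Accuracy | 100% | 100% | 73.6% |
| Avg wall time | 13.4 s | 22.6 s | 72.9 s |
| Avg tokens / fix | 16.2K | 81.0K | 64.8K |
| Avg cost / fix | $0.021 | n/a | $0.139 |
| Repo understanding | 0 tokens | LLM context | LLM context |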
5× fewer tokens than Cursor, 1.7× faster.
$0.02 per verified fix, 6.6× cheaper than Cline.
100% verified (87/87) vs Cline's 73.6%.
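
The headline multipliers can be recomputed from the table figures. The sketch below does only that arithmetic; the dictionary layout and the names `results` and `ratio` are illustrative and are not part of the published dataset.

```python
# Figures copied from the comparison table; the structure is illustrative only.
results = {
    "Codna":  {"verified": 87, "wall_s": 13.4, "tokens_k": 16.2, "cost_usd": 0.021},
    "Cursor": {"verified": 87, "wall_s": 22.6, "tokens_k": 81.0, "cost_usd": None},
    "Cline":  {"verified": 64, "wall_s": 72.9, "tokens_k": 64.8, "cost_usd": 0.139},
}
TOTAL_TASKS = 87


def ratio(metric, baseline, subject="Codna"):
    """How many times larger the baseline's metric is than the subject's."""
    return results[baseline][metric] / results[subject][metric]


token_ratio = ratio("tokens_k", "Cursor")   # tokens per fix, Cursor vs Codna
speed_ratio = ratio("wall_s", "Cursor")     # wall time, Cursor vs Codna
cost_ratio = ratio("cost_usd", "Cline")     # cost per fix, Cline vs Codna
cline_accuracy = results["Cline"]["verified"] / TOTAL_TASKS

print(f"{token_ratio:.1f}x fewer tokens than Cursor")
print(f"{speed_ratio:.1f}x faster than Cursor")
print(f"{cost_ratio:.1f}x cheaper than Cline")
print(f"Cline accuracy: {cline_accuracy:.1%}")
```

Cursor's cost is listed as n/a in the table, so no cost ratio against Cursor is computed here.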

Methodology

Real bugs. Real repositories. Passing tests.

The benchmark set includes single bugs, multiple bugs in the same repository, and repositories the agent has never seen. A scenario counts only when the test passes.

8

Bug-fix scenarios

Each run starts from a reproducible failing test and ends with verification.

130

Repositories analyzed

Codna mapped 130 repositories across 110 languages in 9.2 seconds.

162×

Less context

Codna sends the agent an evidence bundle rather than the entire repository.

$0.02

Per verified fix

Measured spend to ship one verified fix — about 28× cheaper than a typical agentic edit.

Measured 2026-06-15 on identical isolated checkouts. Every scenario counts only when your own test passes. Download the raw dataset (JSON) →

Frequently asked

Codna ran head-to-head against OpenAI Codex CLI and Google Gemini CLI across 8 real bug-fix scenarios. Every fix was verified by the project's own tests. No synthetic tasks, no self-reported numbers.

A fix counts only when the test suite passes. Codna went 8 for 8. A fix that compiles but breaks tests is not counted.
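
As a rough illustration of that counting rule, the sketch below treats a fix as verified only when the project's test command exits with status 0. It is a minimal sketch under stated assumptions, not Codna's harness: the function names `is_verified` and `tally` are hypothetical, and the test command is whatever the project under test already uses.

```python
import subprocess
import sys


def is_verified(test_command, cwd="."):
    """Return True only when the test command exits with status 0."""
    completed = subprocess.run(
        test_command, cwd=cwd, capture_output=True, text=True
    )
    return completed.returncode == 0


def tally(outcomes):
    """Count verified fixes; anything that did not pass its tests is excluded."""
    verified = sum(1 for ok in outcomes if ok)
    return verified, len(outcomes)


if __name__ == "__main__":
    # Stand-in commands: one exits cleanly, one fails like a broken test run.
    passing = [sys.executable, "-c", "raise SystemExit(0)"]
    failing = [sys.executable, "-c", "raise SystemExit(1)"]
    print(tally([is_verified(passing), is_verified(failing)]))
```

A patch that compiles but leaves the suite failing produces a non-zero exit status and is therefore never counted.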

Codna sends the AI agent an evidence bundle measured at roughly 600 tokens — 162x less context than reading the full repository. That translates directly to lower API costs, typically pennies per fix.
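
The two figures in that answer imply a size for the full-repository context, which the page does not state directly. The short calculation below derives it; the result is an inference from the 600-token and 162× figures, not a separately measured number, and the variable names are illustrative.

```python
# Both inputs come from the text above; the output is implied, not measured.
bundle_tokens = 600        # approximate evidence-bundle size
reduction_factor = 162     # stated reduction versus reading the full repository

implied_full_repo_tokens = bundle_tokens * reduction_factor
print(f"Implied full-repository context: ~{implied_full_repo_tokens:,} tokens")
```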

The deterministic engine maps the dependency and blast-radius graph in about 60 ms using zero LLM tokens. The agent receives only the relevant evidence, so there is far less to process before a fix is produced.
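
To make "blast radius" concrete, here is one common way to compute it without any LLM involvement: invert a dependency graph and walk it breadth-first from the changed module. This is a generic sketch, not Codna's engine; the module names and the `blast_radius` function are hypothetical.

```python
from collections import deque


def blast_radius(dependencies, changed):
    """Return every module that directly or transitively depends on `changed`."""
    # Invert the edges: dependency -> set of modules that import it.
    dependents = {}
    for module, deps in dependencies.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(module)

    affected = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, ()):
            if parent not in affected:
                affected.add(parent)
                queue.append(parent)
    return affected


# Hypothetical module graph: each key lists the modules it imports.
deps = {
    "app": ["billing", "auth"],
    "billing": ["db"],
    "auth": ["db", "crypto"],
    "db": [],
    "crypto": [],
}

print(sorted(blast_radius(deps, "db")))
```

Because the traversal is plain graph search over static import edges, it is deterministic and consumes no model tokens, which is the property the answer above relies on.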

Yes. In measured testing, Codna mapped 130 repositories in 9.2 seconds, using zero tokens for the mapping step.

The benchmark compared Codna against the default configurations of OpenAI Codex CLI and Google Gemini CLI as available at time of testing. Codna is bring-your-own-key, so you choose the underlying model.

Verify it in your own repository.

codna fix . --issue "the failing test"