The Globe

Karpathy’s March of Nines reveals why 90% AI reliability isn’t even near sufficient

Source link : https://tech365.info/karpathys-march-of-nines-reveals-why-90-ai-reliability-isnt-even-near-sufficient/

“When you get a demo and something works 90% of the time, that’s just the first nine.” — Andrej Karpathy

The “March of Nines” frames a common production reality: you can reach the first 90% of reliability with a strong demo, and each additional nine often requires comparable engineering effort. For enterprise teams, the gap between “usually works” and “operates like dependable software” determines adoption.

The compounding math behind the March of Nines

“Every single nine is the same amount of work.” — Andrej Karpathy

Agentic workflows compound failure. A typical enterprise flow might include: intent parsing, context retrieval, planning, multiple tool calls, validation, formatting, and audit logging. If a workflow has n steps and each step succeeds independently with probability p, end-to-end success is roughly p^n.

In a 10-step workflow, the failure rate of each step compounds into the end-to-end result. Correlated outages (auth, rate limits, connectors) will dominate unless you harden shared dependencies.
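One way to see why a shared dependency can dominate is to extend p^n with a single availability factor. The sketch below is a simplified model, not something taken from the article: the function name `end_to_end_success` and the parameter `shared_availability` are hypothetical, and it assumes every step needs the same shared dependency (for example auth), which is either up or down for the whole workflow.

```python
def end_to_end_success(p_step: float, n_steps: int,
                       shared_availability: float = 1.0) -> float:
    """Estimate workflow success under a simplified model.

    Assumptions (illustrative only):
    - each of the n_steps succeeds independently with probability p_step
    - one shared dependency is available with probability
      shared_availability, and its outage fails the whole workflow
    """
    return shared_availability * p_step ** n_steps


# Ten steps at 99.9% each, with and without a 99%-available shared dependency.
independent_only = end_to_end_success(0.999, 10)
with_shared_dep = end_to_end_success(0.999, 10, shared_availability=0.99)

print(f"independent steps only: {1 - independent_only:.2%} failure")
print(f"with shared dependency: {1 - with_shared_dep:.2%} failure")
```

Under these assumptions, a shared dependency at 99% availability roughly doubles the workflow failure rate even when every individual step is at three nines, which is why hardening it pays off before tuning any single step.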

| Per-step success (p) | 10-step success (p^10) | Workflow failure rate | At 10 workflows/day | What this means in practice |
| --- | --- | --- | --- | --- |
| 90.00% | 34.87% | 65.13% | ~6.5 interruptions/day | Prototype territory. Most workflows get interrupted. |
| 99.00% | 90.44% | 9.56% | ~1 every 1.0 days | Fine for a demo, but interruptions are still frequent in real use. |
| 99.90% | 99.00% | 1.00% | ~1 every 10.0 days | Still feels unreliable because misses remain… |
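The figures in the table follow directly from p^n. As a minimal sketch, assuming independent steps and the table's 10 workflows per day, the rows can be recomputed as below. The function name `workflow_stats` and its dictionary keys are illustrative, not from the article.

```python
def workflow_stats(p_step: float, n_steps: int = 10,
                   workflows_per_day: int = 10) -> dict:
    """Recompute one table row from the per-step success probability."""
    success = p_step ** n_steps            # end-to-end success, p^n
    failure = 1.0 - success                # workflow failure rate
    interruptions_per_day = failure * workflows_per_day
    return {
        "success": success,
        "failure": failure,
        "interruptions_per_day": interruptions_per_day,
        # Average number of days between interruptions.
        "days_between_interruptions": 1.0 / interruptions_per_day,
    }


for p in (0.90, 0.99, 0.999):
    row = workflow_stats(p)
    print(f"p={p:.2%}  success={row['success']:.2%}  "
          f"failure={row['failure']:.2%}  "
          f"interruptions/day={row['interruptions_per_day']:.2f}")
```

Changing `n_steps` shows how quickly longer workflows erode the same per-step reliability.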

----

Author : tech365

Publish date : 2026-03-08 20:34:00

Copyright for syndicated content belongs to the linked Source.
