Sarvam AI Multi-Step Logic Stress Benchmark

Scope

This stress test targets multi-constraint logic problems that include self-referential statements and require validating the end state against every constraint.

Score

Overall: 5.8 / 10

Observed Behavior

Local steps are often valid.
Global state checks are inconsistent.
Recursive truth setups cause instability.
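
As an illustration of what a recursive truth setup looks like, here is a minimal sketch in Python. The three-statement puzzle is hypothetical and is not taken from the benchmark; the names STATEMENTS and consistent_assignments are made up for this example. It enumerates every truth assignment and keeps only those where each statement's assigned value matches what the statement evaluates to under that same assignment.

```python
from itertools import product

# Hypothetical puzzle (not from the benchmark): each statement makes a claim
# about how many of the three statements are false, including itself.
STATEMENTS = [
    lambda truth: truth.count(False) == 1,  # S1: exactly one statement is false
    lambda truth: truth.count(False) == 2,  # S2: exactly two statements are false
    lambda truth: truth.count(False) == 3,  # S3: all three statements are false
]

def consistent_assignments(statements):
    """Return every assignment in which each statement's assigned truth
    value equals the value it evaluates to under that assignment."""
    results = []
    for truth in product([True, False], repeat=len(statements)):
        if all(stmt(truth) == value for stmt, value in zip(statements, truth)):
            results.append(truth)
    return results

SOLUTIONS = consistent_assignments(STATEMENTS)
print(SOLUTIONS)
```

The point of the sketch is that a locally plausible step (for example, "assume S1 is true") has to be re-checked against the whole assignment, because the truth of each statement depends on the truth of all of them.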

Typical Failure Modes

Constraint retention loss
Missed automatic rule activation
Arithmetic or assignment checks skipped at final pass
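
The failure modes above all concern checks that should run against the finished answer. A minimal sketch of such a final pass follows, assuming a hypothetical allocation problem; the constraints, the state layout, and the function name verify_final_state are invented for illustration and do not describe the benchmark's actual tasks or scoring.

```python
def verify_final_state(state, constraints):
    """Re-check every named constraint against the final state and
    return the names of the ones that are violated."""
    return [name for name, check in constraints if not check(state)]

# Hypothetical constraints, invented for illustration.
CONSTRAINTS = [
    # Arithmetic check on the finished assignment.
    ("total is 10", lambda s: sum(s.values()) == 10),
    ("all values non-negative", lambda s: all(v >= 0 for v in s.values())),
    # Conditional rule: it activates automatically once a exceeds 3.
    ("a > 3 forces c == 0", lambda s: s["a"] <= 3 or s["c"] == 0),
]

# This candidate passes the arithmetic check but misses the triggered rule.
candidate = {"a": 4, "b": 3, "c": 3}
VIOLATIONS = verify_final_state(candidate, CONSTRAINTS)

# A corrected answer passes every check.
fixed = {"a": 4, "b": 6, "c": 0}
FIXED_VIOLATIONS = verify_final_state(fixed, CONSTRAINTS)

print(VIOLATIONS, FIXED_VIOLATIONS)
```

Keeping the constraints in one list and re-running all of them at the end is one simple way to guard against both dropped constraints and conditional rules that only become active late in the solution.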

Verdict

The model can reason linearly but needs stronger whole-system verification before final answers.