Sarvam AI Multi-Layer Nested Hypothetical Benchmark

Scope

This benchmark tests reasoning across four concurrent abstraction layers:

Physical constraints
System architecture
Policy guarantees
Temporary executive override
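
As an illustration only, the four layers and a time-bounded override can be sketched as below. This is a minimal sketch, not the benchmark's actual harness: the `Layer`, `Rule`, and `active_by_layer` names, the integer time steps, and the example statements are all hypothetical. It deliberately does not decide which layer wins when two conflict, since the report does not state a precedence order.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Layer(Enum):
    # The four layers named in the Scope section.
    PHYSICAL = "physical constraints"
    ARCHITECTURE = "system architecture"
    POLICY = "policy guarantees"
    OVERRIDE = "temporary executive override"


@dataclass(frozen=True)
class Rule:
    layer: Layer
    statement: str
    # Hypothetical validity window in abstract time steps; None means unbounded.
    start: Optional[int] = None
    end: Optional[int] = None

    def active_at(self, t: int) -> bool:
        """Return True if the rule applies at time t (half-open window [start, end))."""
        after_start = self.start is None or t >= self.start
        before_end = self.end is None or t < self.end
        return after_start and before_end


def active_by_layer(rules: list[Rule], t: int) -> dict[Layer, list[str]]:
    """Group the statements active at time t by layer, keeping the layers separate."""
    grouped: dict[Layer, list[str]] = {layer: [] for layer in Layer}
    for rule in rules:
        if rule.active_at(t):
            grouped[rule.layer].append(rule.statement)
    return grouped


# Made-up example rules, one per layer; only the override is time-bounded.
rules = [
    Rule(Layer.PHYSICAL, "signal latency is at least 40 ms"),
    Rule(Layer.ARCHITECTURE, "writes go through a single leader"),
    Rule(Layer.POLICY, "every write is audited"),
    Rule(Layer.OVERRIDE, "audit is suspended", start=10, end=20),
]
```

Calling `active_by_layer(rules, 15)` returns the override statement alongside the policy statement, each under its own layer, while at time step 20 the override list is empty again because the window is half-open. Resolving the conflict between the two is left to the caller.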

Score

Overall: 8.8 / 10

What It Did Well

Preserved layer boundaries across all cases
Correctly reasoned about temporary override windows
Correctly identified duplicate-side-effect contradictions

Where It Can Improve

Deeper analysis of the hierarchy between executive override and policy guarantees
More explicit treatment of irreversible ordering effects

Verdict

One of the strongest runs. Layered reasoning is stable and resilient under nested hypotheticals.