Sarvam AI Silent Inconsistency Injection Benchmark

Scope

The model must detect injected inconsistencies without direct hints and decide whether design claims remain structurally valid.

Score

Overall: 7.8 / 10
Contradictions detected: 8 / 10

Strength Pattern

Strong under explicit architectural structure
Good semantic reasoning around ordering and retries

Gap Pattern

Analysis of real-time bounds and resource pressure is less reliable
Some answers restate the claimed guarantee without showing how it would be enforced
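
The difference between stating a guarantee and showing it can be enforced is easier to see with a small worked case. The sketch below is a hypothetical illustration, not an item from the benchmark: the `RetryPolicy` type, the function names, and all numbers are assumptions chosen for the example. It takes a design that promises a response deadline while also specifying a retry policy, and checks whether the deadline still holds when every attempt runs to its timeout.

```python
from dataclasses import dataclass


@dataclass
class RetryPolicy:
    """Hypothetical retry settings taken from a design document."""
    max_attempts: int  # total attempts, including the first one
    timeout_ms: int    # per-attempt timeout
    backoff_ms: int    # fixed wait between consecutive attempts


def worst_case_latency_ms(policy: RetryPolicy) -> int:
    """Upper bound on latency if every attempt runs to its timeout."""
    waits = policy.backoff_ms * (policy.max_attempts - 1)
    return policy.max_attempts * policy.timeout_ms + waits


def claim_is_enforceable(policy: RetryPolicy, deadline_ms: int) -> bool:
    """True only if the stated deadline holds even in the worst case."""
    return worst_case_latency_ms(policy) <= deadline_ms


# Assumed design text: "responds within 200 ms" alongside
# "3 attempts, 100 ms timeout each, 20 ms backoff".
policy = RetryPolicy(max_attempts=3, timeout_ms=100, backoff_ms=20)
print(worst_case_latency_ms(policy))      # 340
print(claim_is_enforceable(policy, 200))  # False
```

Here the two statements contradict each other without either one being flagged: the retry policy alone allows 340 ms, so the 200 ms deadline is a guarantee on paper that the design cannot enforce.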

Verdict

Good defensive reasoning, but not yet expert-level in hard-constraint or infrastructure-feasibility analysis.