Benchmark Status: This evaluation is based on the Sarvam AI web version only, from a single prompt run with no re-prompting. It has not been validated in real projects and remains incomplete until stable API access is available.

Scope

Fifteen coding problems were solved, all in Python. The run targeted algorithmic reasoning, data-structure design, and complexity discipline.

Score

  • Final: 135.5 / 150
  • Average: 9.03 / 10 per problem (135.5 / 15)

Strong Areas

  • Graph problems (state modeling and pruning)
  • DP tasks (regex, product subarray, staircase)
  • Monotonic stack boundaries
  • LRU cache implementation without shortcuts
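For readers unfamiliar with the last item, the following is a minimal sketch of an LRU cache built by hand, assuming "without shortcuts" means not relying on `collections.OrderedDict` or `functools.lru_cache`. It is not the model's actual output, which this report does not reproduce. The names `LRUCache` and `_Node` are hypothetical and chosen only for illustration.

```python
class _Node:
    """Doubly linked list node (hypothetical helper for this sketch)."""
    __slots__ = ("key", "value", "prev", "next")

    def __init__(self, key=None, value=None):
        self.key = key
        self.value = value
        self.prev = None
        self.next = None


class LRUCache:
    """Hash map plus doubly linked list: O(1) get and put."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.nodes = {}            # key -> node
        self.head = _Node()        # sentinel on the most-recently-used side
        self.tail = _Node()        # sentinel on the least-recently-used side
        self.head.next = self.tail
        self.tail.prev = self.head

    def _unlink(self, node):
        # Detach a node from its current position in the list.
        node.prev.next = node.next
        node.next.prev = node.prev

    def _push_front(self, node):
        # Insert a node right after the head sentinel.
        node.next = self.head.next
        node.prev = self.head
        self.head.next.prev = node
        self.head.next = node

    def get(self, key, default=None):
        node = self.nodes.get(key)
        if node is None:
            return default
        # A successful read makes the entry the most recently used.
        self._unlink(node)
        self._push_front(node)
        return node.value

    def put(self, key, value):
        node = self.nodes.get(key)
        if node is not None:
            # Update in place and refresh recency.
            node.value = value
            self._unlink(node)
            self._push_front(node)
            return
        if len(self.nodes) >= self.capacity:
            # Evict the least recently used entry (just before the tail).
            lru = self.tail.prev
            self._unlink(lru)
            del self.nodes[lru.key]
        node = _Node(key, value)
        self.nodes[key] = node
        self._push_front(node)
```

The two sentinel nodes remove the empty-list and single-element special cases, which is where hand-written linked-list code most often goes wrong.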

Gaps

  • One reset bug in the max product subarray solution, later corrected
  • QuickSelect fragile on edge cases
  • No C++/Rust/Go/Java parity check
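The report does not record what the reset bug looked like. As an illustration only, the sketch below shows the usual formulation of the maximum product subarray problem, where the running state has to restart correctly at zeros and swap at negative numbers. The function name `max_product_subarray` is hypothetical, and this is an assumed reference version, not the evaluated solution.

```python
def max_product_subarray(nums):
    """Return the largest product of any contiguous, non-empty subarray."""
    if not nums:
        raise ValueError("nums must be non-empty")

    # Track both extremes: a negative number can turn the smallest
    # product so far into the largest one.
    best = cur_max = cur_min = nums[0]

    for x in nums[1:]:
        if x < 0:
            # Multiplying by a negative flips which extreme is which.
            cur_max, cur_min = cur_min, cur_max

        # Taking max/min against x itself is the "reset": after a zero,
        # or when the running product is worse than starting over at x,
        # the subarray restarts here.
        cur_max = max(x, cur_max * x)
        cur_min = min(x, cur_min * x)

        best = max(best, cur_max)

    return best
```

Inputs containing zeros, a single negative element, or an even number of negatives are the cases most likely to expose a faulty reset, so they make reasonable regression checks.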

Complexity Profile

The solutions were typically optimal or near-optimal and avoided brute-force fallbacks.

Verdict

Strong Python algorithmic competency. This benchmark does not demonstrate systems-level or multi-language production readiness.