
tinker_cookbook.exceptions.EvalTimeoutError

class tinker_cookbook.exceptions.EvalTimeoutError(EvalError, TimeoutError)

A single evaluation example exceeded its time limit.
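Because the class inherits from both EvalError and the built-in TimeoutError, a handler written for either base should catch it. The sketch below illustrates this with local stand-in classes that mirror the documented signature; they are assumptions for illustration, not the library's own definitions, and `classify` is a hypothetical helper.

```python
class EvalError(Exception):
    """Stand-in for the library's EvalError base class (assumed shape)."""


class EvalTimeoutError(EvalError, TimeoutError):
    """Stand-in mirroring the documented bases: (EvalError, TimeoutError)."""


def classify(exc: BaseException) -> str:
    """Hypothetical helper: report which handler would catch the exception."""
    try:
        raise exc
    except TimeoutError:
        # Reached for EvalTimeoutError because TimeoutError is one of its bases.
        return "timeout"
    except EvalError:
        return "eval-error"
    except Exception:
        return "other"
```

Generic timeout handling (`except TimeoutError`) and evaluation-specific handling (`except EvalError`) would both see the exception, so the order of `except` clauses decides which one runs.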

Raised when asyncio.wait_for hits the per-example timeout configured via BenchmarkConfig.timeout_seconds. The example is scored as a failure (reward=0) with error="timeout (Ns)".
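The behavior described above can be sketched roughly as follows. This is a minimal illustration, not the library's implementation: the exception classes are local stand-ins, `ExampleResult`, `run_with_timeout`, and `score_example` are hypothetical names, and the exact formatting of N in the error string is an assumption.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Optional


class EvalError(Exception):
    """Stand-in for the library's EvalError base class."""


class EvalTimeoutError(EvalError, TimeoutError):
    """Stand-in mirroring the documented class signature."""


@dataclass
class ExampleResult:
    # Hypothetical result record; the real result type may differ.
    reward: float
    error: Optional[str] = None


async def run_with_timeout(
    run_example: Callable[[], Awaitable[float]], timeout_seconds: float
) -> float:
    """Await one example, converting an asyncio timeout into EvalTimeoutError."""
    try:
        return await asyncio.wait_for(run_example(), timeout=timeout_seconds)
    except asyncio.TimeoutError as exc:
        raise EvalTimeoutError(f"timeout ({timeout_seconds:g}s)") from exc


async def score_example(
    run_example: Callable[[], Awaitable[float]], timeout_seconds: float
) -> ExampleResult:
    """Score one example; a timeout counts as a failure with reward 0."""
    try:
        reward = await run_with_timeout(run_example, timeout_seconds)
    except EvalTimeoutError as exc:
        return ExampleResult(reward=0.0, error=str(exc))
    return ExampleResult(reward=reward)
```

A usage example under the same assumptions: `asyncio.run(score_example(my_example, 60))`, where `my_example` is any zero-argument coroutine function returning a reward.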

Timeout thresholds vary by benchmark type:

  • Single-turn programmatic grading: 60–300s
  • Single-turn with LLM judge: 300–600s
  • Code execution in sandbox: 300–600s
  • Multi-turn agent interaction: 600–1800s
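For quick reference, the ranges above can be captured in a small lookup table. The dictionary keys and the `suggest_timeout` helper are hypothetical labels chosen for this sketch; only the numeric ranges come from the list above.

```python
from typing import Dict, Tuple

# (low, high) timeout ranges in seconds, copied from the list above.
# The keys are hypothetical labels, not identifiers from the library.
TIMEOUT_RANGES_SECONDS: Dict[str, Tuple[int, int]] = {
    "single_turn_programmatic": (60, 300),
    "single_turn_llm_judge": (300, 600),
    "sandbox_code_execution": (300, 600),
    "multi_turn_agent": (600, 1800),
}


def suggest_timeout(benchmark_type: str, slow: bool = False) -> int:
    """Return the low end of the range, or the high end for slow workloads."""
    low, high = TIMEOUT_RANGES_SECONDS[benchmark_type]
    return high if slow else low
```

The returned value is only a starting point for the timeout_seconds setting; the right number depends on the benchmark and the hardware it runs on.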

To change the limit, set BenchmarkConfig.timeout_seconds. If a benchmark frequently times out, consider increasing the timeout rather than treating it as a bug: some benchmarks are inherently slow (e.g., multi-turn SWE tasks on large codebases).