tinker_cookbook.exceptions.EvalTimeoutError
class tinker_cookbook.exceptions.EvalTimeoutError(EvalError, TimeoutError)
A single evaluation example exceeded its time limit.
Raised when asyncio.wait_for hits the per-example timeout
configured via BenchmarkConfig.timeout_seconds. The example
is scored as a failure (reward=0) with error="timeout (Ns)".
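The flow above can be sketched roughly as follows. This is a minimal illustration, not the library's actual implementation: `EvalError` and `EvalTimeoutError` are redefined locally as stand-ins so the snippet runs without tinker_cookbook, and `run_example`, `run_with_timeout`, `score_example`, the result dict layout, and the exact formatting of `N` in the error string are all assumptions made for the example.

```python
import asyncio


# Local stand-ins so the sketch is self-contained; the real classes live in
# tinker_cookbook.exceptions.
class EvalError(Exception):
    pass


class EvalTimeoutError(EvalError, TimeoutError):
    pass


async def run_example(delay: float) -> float:
    # Hypothetical evaluation of one example; sleeps to simulate work.
    await asyncio.sleep(delay)
    return 1.0


async def run_with_timeout(delay: float, timeout_seconds: float) -> float:
    # Convert the asyncio timeout into the evaluation-specific exception.
    try:
        return await asyncio.wait_for(run_example(delay), timeout=timeout_seconds)
    except asyncio.TimeoutError as exc:
        raise EvalTimeoutError(f"timeout ({timeout_seconds:g}s)") from exc


async def score_example(delay: float, timeout_seconds: float) -> dict:
    # A timed-out example is recorded as a failure instead of aborting the run.
    try:
        reward = await run_with_timeout(delay, timeout_seconds)
    except EvalTimeoutError as err:
        return {"reward": 0.0, "error": str(err)}
    return {"reward": reward, "error": None}


slow = asyncio.run(score_example(delay=0.5, timeout_seconds=0.05))
fast = asyncio.run(score_example(delay=0.0, timeout_seconds=1.0))
```

Because the class derives from both EvalError and TimeoutError, a caller can catch it with either `except EvalError` or `except TimeoutError`.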
Timeout thresholds vary by benchmark type:
- Single-turn programmatic grading: 60–300s
- Single-turn with LLM judge: 300–600s
- Code execution in sandbox: 300–600s
- Multi-turn agent interaction: 600–1800s
To change the limit, set BenchmarkConfig.timeout_seconds. If a
benchmark times out frequently, consider raising the timeout
before treating it as a bug: some benchmarks are inherently
slow (e.g., multi-turn SWE tasks on large codebases).
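One way to judge whether timeouts are frequent is to measure the share of examples whose error field starts with "timeout (". The sketch below assumes results are available as a list of dicts with an `error` key; that layout and the `timeout_rate` helper are hypothetical and not part of tinker_cookbook.

```python
def timeout_rate(results: list[dict]) -> float:
    """Return the fraction of results whose error marks a timeout."""
    if not results:
        return 0.0
    timed_out = sum(
        1 for r in results
        if str(r.get("error") or "").startswith("timeout (")
    )
    return timed_out / len(results)


# Hypothetical per-example results for illustration only.
sample_results = [
    {"reward": 1.0, "error": None},
    {"reward": 0.0, "error": "timeout (300s)"},
    {"reward": 0.0, "error": "parse error"},
    {"reward": 0.0, "error": "timeout (300s)"},
]
rate = timeout_rate(sample_results)
```

A high rate on a benchmark known to be slow suggests raising the timeout, while a high rate on a fast, single-turn benchmark is more likely to indicate a real problem.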