tinker_cookbook.stores.TrainingRunStore
class tinker_cookbook.stores.TrainingRunStore()
Typed read/write access to one training run's data.
All file I/O goes through the Storage protocol — no direct
Path/open() usage. Pickle-serializable when freshly
constructed (lazy reader init).
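The "pickle-serializable when freshly constructed" property usually follows from deferring creation of unpicklable resources until first use. The sketch below illustrates that general pattern only. `LazyReaderStore`, its `root` argument, and `_get_reader()` are hypothetical names invented for the illustration, not the library's implementation.

```python
import pickle
import threading
from typing import Any


class LazyReaderStore:
    """Hypothetical stand-in; NOT the real TrainingRunStore."""

    def __init__(self, root: str) -> None:
        self.root = root
        # The reader is created on first use, so a fresh instance
        # holds only plain, picklable attributes.
        self._reader: Any = None

    def _get_reader(self) -> Any:
        if self._reader is None:
            # A lock stands in for any unpicklable resource
            # (open file handle, network client, ...).
            self._reader = threading.Lock()
        return self._reader


fresh = LazyReaderStore("runs/demo")  # hypothetical path
clone = pickle.loads(pickle.dumps(fresh))  # succeeds: no reader exists yet
```

Under this pattern, an instance is safe to send to worker processes as long as it is pickled before its first read.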
url(path)
Return a human-readable URI for a path within this run.
Useful for logging in distributed workers::
    logger.info("Writing metrics to %s", store.url("metrics.jsonl"))
Parameters:
- path (str)
Returns: str
read_config()
Read config.json (cached after first read).
Returns: dict[str, Any] | None
read_metrics()
Read all metrics (incremental — only new data from disk).
Returns: list[dict[str, Any]]
read_new_metrics()
Read only metrics added since last call.
Returns: list[dict[str, Any]]
metric_keys()
All metric keys seen so far (excluding 'step').
Returns: set[str]
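The difference between read_metrics(), read_new_metrics(), and metric_keys() can be sketched with an in-memory stand-in. `FakeMetricsStore` and its `append()` helper are hypothetical, and the cursor-based bookkeeping is an assumption about how "since last call" might behave, not the library's actual implementation.

```python
from typing import Any


class FakeMetricsStore:
    """Hypothetical in-memory stand-in for the documented read methods."""

    def __init__(self) -> None:
        self._records: list[dict[str, Any]] = []
        self._cursor = 0  # index of the first record not yet returned

    def append(self, record: dict[str, Any]) -> None:
        # Hypothetical helper that simulates a writer adding a record.
        self._records.append(record)

    def read_metrics(self) -> list[dict[str, Any]]:
        return list(self._records)

    def read_new_metrics(self) -> list[dict[str, Any]]:
        new = self._records[self._cursor:]
        self._cursor = len(self._records)
        return new

    def metric_keys(self) -> set[str]:
        keys: set[str] = set()
        for record in self._records:
            keys.update(record)
        keys.discard("step")  # 'step' is excluded, as documented
        return keys


metrics_store = FakeMetricsStore()
metrics_store.append({"step": 0, "loss": 2.5})
first_batch = metrics_store.read_new_metrics()
metrics_store.append({"step": 1, "loss": 2.1, "lr": 1e-4})
second_batch = metrics_store.read_new_metrics()
third_batch = metrics_store.read_new_metrics()  # nothing new was written
```

A dashboard that polls on a timer would typically call read_new_metrics() and append the result to its own buffer, while a one-off analysis script would call read_metrics().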
read_rollouts(iteration, base_name)
Read rollout summaries for an iteration as raw dicts.
Parameters:
- iteration (int) – Training iteration number.
- base_name (str) – Prefix for the JSONL file (e.g.
"train", "eval_gsm8k"). Matches the naming used by rollout_summaries_jsonl_path() in RL training.
Returns: list[dict[str, Any]]
read_single_rollout(iteration, group_idx, traj_idx, base_name)
Find one rollout by group and trajectory index, or None.
Parameters:
- iteration (int)
- group_idx
- traj_idx
- base_name (str)
Returns: dict[str, Any] | None
read_checkpoints()
Read checkpoints.jsonl.
Returns: list[dict[str, Any]]
read_checkpoint_records()
Read checkpoints.jsonl as CheckpointRecord objects.
Returns: list[Any]
read_timing()
Read all timing records (incremental — only new data from disk).
Returns: list[dict[str, Any]]
read_logtree(iteration, base_name)
Read a logtree JSON file for an iteration, or None if missing.
Parameters:
- iteration (int)
- base_name (str)
Returns: dict[str, Any] | None
list_logtrees(iteration)
List logtree base names for an iteration (e.g. ["train", "eval_gsm8k"]).
Parameters:
- iteration (int)
Returns: list[str]
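list_logtrees() and read_logtree() combine naturally to load every logtree for one iteration. The helper below is a minimal sketch. `load_logtrees` and `FakeLogtreeStore` are hypothetical names, and the fake only mimics the two documented methods so the example runs without a real training run.

```python
from typing import Any, Optional


class FakeLogtreeStore:
    """Hypothetical stand-in exposing list_logtrees() and read_logtree()."""

    def __init__(self, trees: dict[tuple[int, str], dict[str, Any]]) -> None:
        self._trees = trees

    def list_logtrees(self, iteration: int) -> list[str]:
        return sorted(name for (it, name) in self._trees if it == iteration)

    def read_logtree(self, iteration: int, base_name: str) -> Optional[dict[str, Any]]:
        return self._trees.get((iteration, base_name))


def load_logtrees(store: Any, iteration: int) -> dict[str, dict[str, Any]]:
    """Map each logtree base name to its parsed JSON for one iteration."""
    loaded: dict[str, dict[str, Any]] = {}
    for base_name in store.list_logtrees(iteration):
        tree = store.read_logtree(iteration, base_name)
        if tree is not None:  # read_logtree returns None if the file is missing
            loaded[base_name] = tree
    return loaded


logtree_store = FakeLogtreeStore(
    {
        (0, "train"): {"title": "train"},
        (0, "eval_gsm8k"): {"title": "eval"},
        (1, "train"): {"title": "train"},
    }
)
trees_at_0 = load_logtrees(logtree_store, 0)
```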
list_iterations()
List all iteration directories with metadata about their contents.
Returns: list[IterationInfo]
write_config(config)
Write config.json (overwrites if exists, updates cache).
Parameters:
- config (dict[str, Any])
Returns: None
write_metrics(metrics, step)
Append one metrics record to metrics.jsonl.
The record is {"step": step, ...metrics} if step is given,
otherwise just the metrics dict.
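Parameters:
- metrics – The metrics dict to append.
- step – Optional step number; included in the record when given.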
Returns: None
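The record shape described for write_metrics() can be sketched as a small pure function. `build_metrics_record` is a hypothetical helper written for this illustration, and treating `step=None` as "not given" is an assumption about the signature.

```python
import json
from typing import Any, Optional


def build_metrics_record(
    metrics: dict[str, Any], step: Optional[int] = None
) -> dict[str, Any]:
    """Build the dict that would be appended as one JSONL line."""
    if step is None:
        return dict(metrics)  # no step given: just the metrics dict
    return {"step": step, **metrics}


record_with_step = build_metrics_record({"loss": 0.5}, step=3)
record_without_step = build_metrics_record({"loss": 0.5})
jsonl_line = json.dumps(record_with_step)
```

Each call to write_metrics() appends one such line, so metrics.jsonl grows by exactly one record per call.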
write_timing_spans(step, spans)
Append one timing record to timing_spans.jsonl.
Each span dict should have keys: name, duration,
wall_start, wall_end.
Parameters:
- step
- spans
Returns: None
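One way to produce span dicts with the four required keys is a small context manager. This is a sketch under stated assumptions: `record_span` is a hypothetical helper, and using seconds for `duration` and Unix timestamps for `wall_start` and `wall_end` is a guess at the units, which the documentation above does not specify.

```python
import time
from contextlib import contextmanager
from typing import Any, Iterator


@contextmanager
def record_span(spans: list[dict[str, Any]], name: str) -> Iterator[None]:
    """Time the enclosed block and append one span dict to `spans`."""
    wall_start = time.time()
    t0 = time.perf_counter()  # monotonic clock for the duration
    try:
        yield
    finally:
        spans.append(
            {
                "name": name,
                "duration": time.perf_counter() - t0,
                "wall_start": wall_start,
                "wall_end": time.time(),
            }
        )


timing_spans: list[dict[str, Any]] = []
with record_span(timing_spans, "sample"):
    sum(range(1000))
```

The collected list could then be passed as the `spans` argument, for example `store.write_timing_spans(step, timing_spans)`.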
write_checkpoint(record)
Append one checkpoint record to checkpoints.jsonl.
Accepts a raw dict (e.g. from CheckpointRecord.to_dict()).
Must contain at least a "name" key.
Parameters:
- record (dict[str, Any])
Returns: None
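Because a checkpoint record must contain at least a "name" key, a caller may want to check the dict before writing it. The sketch below shows such a check. `validate_checkpoint_record` is a hypothetical helper, the `batch` field is invented for the example, and raising ValueError is an assumption; the documentation above does not say how the store itself reacts to a missing key.

```python
from typing import Any


def validate_checkpoint_record(record: dict[str, Any]) -> dict[str, Any]:
    """Return the record unchanged, or raise if the required key is missing."""
    if "name" not in record:
        raise ValueError('checkpoint record must contain a "name" key')
    return record


# "batch" is a hypothetical extra field, used only for illustration.
valid_record = validate_checkpoint_record({"name": "step_000100", "batch": 100})

try:
    validate_checkpoint_record({"batch": 100})
    missing_name_rejected = False
except ValueError:
    missing_name_rejected = True
```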
write_rollouts(iteration, records, base_name)
Write rollout summaries for an iteration (overwrites).
Parameters:
- iteration (int) – Training iteration number.
- records (list[dict[str, Any]]) – List of trajectory dicts to write.
- base_name (str) – Prefix for the JSONL file (e.g.
"train", "eval_gsm8k"). Must match the base_name used in read_rollouts().
Returns: None
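The overwrite semantics and the need for a matching base_name can be sketched with an in-memory stand-in. `FakeRolloutStore` is hypothetical, the `reward` field is invented, and returning an empty list for an unknown base_name is an assumption rather than documented behavior.

```python
import json
from typing import Any


class FakeRolloutStore:
    """Hypothetical stand-in keyed by (iteration, base_name)."""

    def __init__(self) -> None:
        self._files: dict[tuple[int, str], str] = {}

    def write_rollouts(
        self, iteration: int, records: list[dict[str, Any]], base_name: str
    ) -> None:
        # One JSON object per line; assigning replaces any earlier content.
        text = "".join(json.dumps(r) + "\n" for r in records)
        self._files[(iteration, base_name)] = text

    def read_rollouts(self, iteration: int, base_name: str) -> list[dict[str, Any]]:
        text = self._files.get((iteration, base_name), "")
        return [json.loads(line) for line in text.splitlines() if line]


rollout_store = FakeRolloutStore()
rollout_store.write_rollouts(0, [{"reward": 0.0}], "train")
rollout_store.write_rollouts(0, [{"reward": 1.0}, {"reward": 0.5}], "train")

train_rollouts = rollout_store.read_rollouts(0, "train")
eval_rollouts = rollout_store.read_rollouts(0, "eval_gsm8k")  # never written
```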
write_logtree(iteration, data, base_name)
Write a logtree JSON file for an iteration (overwrites).
Parameters:
- iteration (int)
- data
- base_name (str)
Returns: None
write_code_diff(diff)
aread_config()
Async version of read_config.
Returns: dict[str, Any] | None
aread_metrics()
Async version of read_metrics.
Returns: list[dict[str, Any]]
aread_new_metrics()
Async version of read_new_metrics.
Returns: list[dict[str, Any]]
aread_rollouts(iteration, base_name)
Async version of read_rollouts.
Parameters:
- iteration (int)
- base_name (str)
Returns: list[dict[str, Any]]
aread_checkpoints()
Async version of read_checkpoints.
Returns: list[dict[str, Any]]
aread_timing()
Async version of read_timing.
Returns: list[dict[str, Any]]
aread_logtree(iteration, base_name)
Async version of read_logtree.
Parameters:
- iteration (int)
- base_name (str)
Returns: dict[str, Any] | None
awrite_metrics(metrics, step)
Async version of write_metrics.
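Parameters:
- metrics – The metrics dict to append.
- step – Optional step number; included in the record when given.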
Returns: None
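The async variants make it possible to issue several reads concurrently. The sketch below uses asyncio.gather with a stand-in object. `FakeAsyncStore`, its return values, and `load_snapshot` are hypothetical, and whether the real store supports concurrent calls on a single instance is an assumption to verify before relying on it.

```python
import asyncio
from typing import Any, Optional


class FakeAsyncStore:
    """Hypothetical stand-in exposing three of the documented async readers."""

    async def aread_config(self) -> Optional[dict[str, Any]]:
        return {"model_name": "demo"}  # invented config content

    async def aread_metrics(self) -> list[dict[str, Any]]:
        return [{"step": 0, "loss": 1.0}]

    async def aread_checkpoints(self) -> list[dict[str, Any]]:
        return []


async def load_snapshot(store: Any) -> dict[str, Any]:
    """Fetch config, metrics, and checkpoints in one concurrent batch."""
    config, metrics, checkpoints = await asyncio.gather(
        store.aread_config(),
        store.aread_metrics(),
        store.aread_checkpoints(),
    )
    return {"config": config, "metrics": metrics, "checkpoints": checkpoints}


snapshot = asyncio.run(load_snapshot(FakeAsyncStore()))
```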