tinker_cookbook.rl.RolloutStrategy
class tinker_cookbook.rl.RolloutStrategy(ABC)
Controls how trajectories are collected from a group of environments.
Subclasses implement execute, which receives the
EnvGroupBuilder and a policy, creates the environments, runs the rollouts,
and returns the surviving trajectories plus any error information.
Implementations must be picklable: use @dataclass(frozen=True)
with only primitive fields.
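The pickling requirement can be checked with a round trip through the standard pickle module. The following is a minimal sketch, not code from the library: StrategyBase is a hypothetical stand-in for RolloutStrategy so the example runs on its own, and RetryStrategy with its max_retries and fail_fast fields is likewise invented for illustration.

```python
import pickle
from abc import ABC, abstractmethod
from dataclasses import FrozenInstanceError, dataclass


class StrategyBase(ABC):
    """Hypothetical stand-in for RolloutStrategy, so the sketch runs alone."""

    @abstractmethod
    def execute(self, env_group_builder, policy):
        ...


@dataclass(frozen=True)
class RetryStrategy(StrategyBase):
    """Hypothetical strategy configured only by primitive fields."""

    max_retries: int = 3
    fail_fast: bool = False

    def execute(self, env_group_builder, policy):
        raise NotImplementedError("rollout logic omitted in this sketch")


strategy = RetryStrategy(max_retries=5)

# Round trip: serialize the strategy, then rebuild it from the bytes.
restored = pickle.loads(pickle.dumps(strategy))

# Frozen dataclasses reject attribute assignment after construction.
try:
    strategy.max_retries = 10
    mutation_blocked = False
except FrozenInstanceError:
    mutation_blocked = True
```

Because the fields are plain int and bool values, the restored object compares equal to the original. Fields holding open files, clients, or lambdas would typically break this round trip.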
property catches_group_errors
If True, group-level errors (those raised by make_envs or compute_group_rewards)
are caught and the group is skipped. If False, they propagate to the caller.
Returns: bool
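How a caller might act on this flag can be sketched as follows. This is an illustration under stated assumptions, not the library's do_group_rollout: StrategySketch, run_group, and failing_group are hypothetical names, and the flag is modeled as a plain dataclass field rather than a property.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class StrategySketch:
    """Hypothetical stand-in exposing only the catches_group_errors flag."""

    catches_group_errors: bool


def run_group(strategy: StrategySketch, build_and_score: Callable[[], str]) -> Optional[str]:
    """Hypothetical caller: skip the group or re-raise, depending on the flag."""
    try:
        return build_and_score()
    except Exception:
        if strategy.catches_group_errors:
            return None  # group skipped
        raise  # error propagates to the caller


def failing_group() -> str:
    # Stands in for a failure inside make_envs or compute_group_rewards.
    raise ValueError("make_envs failed")


# With the flag set, the failure is swallowed and the group yields nothing.
skipped = run_group(StrategySketch(catches_group_errors=True), failing_group)

# With the flag cleared, the same failure reaches the caller.
try:
    run_group(StrategySketch(catches_group_errors=False), failing_group)
    propagated = False
except ValueError:
    propagated = True
```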
execute(env_group_builder, policy)
Create envs, run rollouts, and return results.
May raise on unrecoverable errors (e.g. retry budget exhausted).
The caller (do_group_rollout) handles group-level error
recovery based on catches_group_errors.
Parameters:
- env_group_builder (EnvGroupBuilder) – Builder used to create the environments for this rollout group.
- policy (TokenCompleter) – The policy (language model) used to generate actions during rollouts.
Returns: RolloutResult – The collected trajectories, surviving environments, and any errors encountered.
Abstract method.
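The contract described for execute (create envs, run rollouts, keep the survivors, report errors, raise once a budget is exhausted) can be sketched with stand-in types. Everything here is an assumption made for illustration: ResultSketch stands in for RolloutResult, TolerantStrategy and max_failures are hypothetical, the builder is reduced to a callable returning rollout functions, and the method is synchronous, which may not match the real signature.

```python
from dataclasses import dataclass, field
from typing import Callable, List

Policy = Callable[[str], str]
Rollout = Callable[[Policy], str]


@dataclass
class ResultSketch:
    """Hypothetical stand-in for RolloutResult."""

    trajectories: List[str] = field(default_factory=list)
    errors: List[str] = field(default_factory=list)


@dataclass(frozen=True)
class TolerantStrategy:
    """Hypothetical strategy: drop failed rollouts, up to max_failures."""

    max_failures: int = 1

    def execute(self, make_envs: Callable[[], List[Rollout]], policy: Policy) -> ResultSketch:
        result = ResultSketch()
        for rollout in make_envs():
            try:
                result.trajectories.append(rollout(policy))
            except RuntimeError as exc:
                result.errors.append(str(exc))
                # Unrecoverable once the failure budget is exceeded.
                if len(result.errors) > self.max_failures:
                    raise RuntimeError("failure budget exhausted") from exc
        return result


def ok(policy: Policy) -> str:
    return policy("prompt")


def broken(policy: Policy) -> str:
    raise RuntimeError("env crashed")


# One failure is within budget, so the two surviving trajectories are returned.
result = TolerantStrategy(max_failures=1).execute(lambda: [ok, broken, ok], policy=str.upper)

# A zero budget turns the same failure into an exception for the caller.
try:
    TolerantStrategy(max_failures=0).execute(lambda: [ok, broken, ok], policy=str.upper)
    raised = False
except RuntimeError:
    raised = True
```

Keeping per-rollout failures inside the result, and raising only when the budget runs out, mirrors the split in the text between errors the strategy absorbs and errors the caller must handle.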