tinker_cookbook.rl.ProblemGroupBuilder
class tinker_cookbook.rl.ProblemGroupBuilder(EnvGroupBuilder)
Builds a group of ProblemEnv instances from a factory callable.
Fields:
- env_thunk (Callable[[], ProblemEnv])
- num_envs (int)
- dataset_name (str, default: 'problems')
make_envs()
Create num_envs ProblemEnv instances using the factory callable.
Returns: Sequence[Env]
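The factory pattern described above can be sketched without importing the library. SketchEnv and SketchGroupBuilder below are hypothetical stand-ins, not the real ProblemEnv or ProblemGroupBuilder. They only mirror the documented field names, and the sketch is synchronous for simplicity (check the library source for whether the real method is declared async).

```python
from dataclasses import dataclass
from typing import Callable, Sequence


class SketchEnv:
    """Hypothetical stand-in for ProblemEnv; holds a single problem."""

    def __init__(self, question: str, answer: str) -> None:
        self.question = question
        self.answer = answer


@dataclass(frozen=True)
class SketchGroupBuilder:
    """Stand-in that mirrors the documented fields of ProblemGroupBuilder."""

    env_thunk: Callable[[], SketchEnv]
    num_envs: int
    dataset_name: str = "problems"

    def make_envs(self) -> Sequence[SketchEnv]:
        # Call the zero-argument factory once per environment so that
        # every rollout in the group gets its own fresh instance.
        return [self.env_thunk() for _ in range(self.num_envs)]


builder = SketchGroupBuilder(
    env_thunk=lambda: SketchEnv("What is 2 + 3?", "5"),
    num_envs=4,
)
envs = builder.make_envs()
```

Because env_thunk takes no arguments, every environment in the group poses the same problem, and each call produces a separate object so state is not shared between rollouts.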
compute_group_rewards(trajectory_group, env_group)
Return zero group rewards (all rewards come from per-step scoring).
Parameters:
- trajectory_group (list[Trajectory])
- env_group (Sequence[Env])
Returns: list[tuple[float, Metrics]]
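A minimal sketch of what a zero group reward looks like in practice. The function zero_group_rewards, the Metrics alias, and the representation of a trajectory as a list of per-step rewards are all assumptions made for illustration. Adding the group reward to the per-step total is likewise an assumed combination rule, not confirmed behavior of the training loop.

```python
from typing import Dict, List, Sequence, Tuple

# Assumed shape for illustration; the library defines its own Metrics type.
Metrics = Dict[str, float]


def zero_group_rewards(
    trajectory_group: Sequence[Sequence[float]],
    env_group: Sequence[object],
) -> List[Tuple[float, Metrics]]:
    # One (reward, metrics) pair per trajectory; the reward is always 0.0
    # and the metrics dict is empty. env_group is accepted but unused.
    return [(0.0, {}) for _ in trajectory_group]


# Hypothetical trajectories, each reduced to its list of per-step rewards.
trajectories = [[0.0, 1.0], [0.0, 0.0], [1.0]]
group = zero_group_rewards(trajectories, env_group=[None, None, None])

# Assumed combination rule: total = sum of step rewards + group reward.
totals = [sum(steps) + g for steps, (g, _) in zip(trajectories, group)]
```

Under that assumption the totals equal the per-step sums, which is the sense in which all reward comes from per-step scoring.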