tinker_cookbook.rl.assemble_training_data

tinker_cookbook.rl.assemble_training_data(trajectory_groups_P, advantages_P)

Convert trajectory groups and their advantages into a flat list of training datums.

Parameters:

Returns: tuple[list[tinker.Datum], list[dict[str, int]]] – A flat list of training datums and a parallel list of metadata dicts mapping each datum back to its group_idx and traj_idx.
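The return shape described above (a flat list plus a parallel metadata list) can be illustrated with a minimal, library-free sketch. The helper `flatten_with_metadata` and the string placeholders are hypothetical stand-ins, not part of tinker_cookbook, and the sketch assumes one item per trajectory, which may not match how the real function builds datums.

```python
from typing import Any


def flatten_with_metadata(
    groups: list[list[Any]],
) -> tuple[list[Any], list[dict[str, int]]]:
    """Hypothetical stand-in: flatten nested groups and record each item's origin."""
    flat: list[Any] = []
    metadata: list[dict[str, int]] = []
    for group_idx, group in enumerate(groups):
        for traj_idx, item in enumerate(group):
            flat.append(item)
            # The metadata entry at position i describes the flat item at position i.
            metadata.append({"group_idx": group_idx, "traj_idx": traj_idx})
    return flat, metadata


# Placeholder strings stand in for per-trajectory training data.
groups = [["a0", "a1"], ["b0"]]
flat, metadata = flatten_with_metadata(groups)

# Each flat item can be traced back to its source group and trajectory.
for item, meta in zip(flat, metadata):
    assert groups[meta["group_idx"]][meta["traj_idx"]] == item
```

Because the two lists are parallel, `zip` over them is enough to attribute any datum to its originating group and trajectory, for example when logging per-group statistics.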