tinker_cookbook.rl.compute_advantages

tinker_cookbook.rl.compute_advantages(trajectory_groups_P)

Compute advantages for each trajectory, centered within groups.

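Parameters: trajectory_groups_P – The trajectory groups for which to compute advantages; centering is applied separately within each group.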

Returns: list[torch.Tensor] – One advantage tensor per group, each of shape (G,), where G is the number of trajectories in that group.
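To illustrate what "centered within groups" means, here is a minimal sketch in plain Python. It is not the library's implementation. It assumes each trajectory can be summarized by a single scalar reward and that centering means subtracting the group mean. The name `center_within_groups` and the list-of-lists input are hypothetical stand-ins for the real trajectory group objects and `torch.Tensor` outputs.

```python
from statistics import fmean


def center_within_groups(rewards_per_group: list[list[float]]) -> list[list[float]]:
    """Hypothetical stand-in: subtract each group's mean reward from its members.

    Assumes one scalar reward per trajectory and non-empty groups.
    """
    advantages_per_group = []
    for rewards in rewards_per_group:
        group_mean = fmean(rewards)  # baseline shared by the whole group
        advantages_per_group.append([r - group_mean for r in rewards])
    return advantages_per_group


# Two groups of different sizes (G = 4 and G = 2).
groups = [[1.0, 0.0, 0.0, 1.0], [2.0, 4.0]]
advantages = center_within_groups(groups)
```

Under these assumptions, each inner list has one entry per trajectory and sums to zero, so a trajectory is only credited for doing better than the other members of its own group.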