tinker_cookbook.rl.StepResult

class tinker_cookbook.rl.StepResult()

Result returned by Env.step.

Fields:

reward (float) – Immediate reward for this step.
episode_done (bool) – Whether the episode has ended.
next_observation (Observation) – Observation for the next step (or the final observation if episode_done is True).
next_stop_condition (StopCondition) – Stop condition for the next generation.
metrics (Metrics, default: field(default_factory=dict), i.e. a new empty dict per instance) – Numeric values aggregated and reported in training logs (e.g., timing, counts).
logs (Logs, default: field(default_factory=dict), i.e. a new empty dict per instance) – Diagnostic info for display/debugging tools (not aggregated like metrics).
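
To illustrate how the fields above fit together, here is a minimal sketch that mirrors them in a stand-in dataclass rather than importing the real class. The name StepResultSketch, the toy_step function, and the concrete type aliases for Observation, StopCondition, Metrics, and Logs are all assumptions made for illustration; the actual types are defined by tinker_cookbook and may differ.

```python
from dataclasses import dataclass, field
from typing import Any

# Placeholder aliases (assumptions): the real Observation, StopCondition,
# Metrics, and Logs types come from tinker_cookbook and may differ.
Observation = list[int]
StopCondition = list[str]
Metrics = dict[str, float]
Logs = dict[str, Any]


@dataclass
class StepResultSketch:
    """Hypothetical stand-in with the same field names as StepResult."""

    reward: float
    episode_done: bool
    next_observation: Observation
    next_stop_condition: StopCondition
    # default_factory gives each instance its own empty dict.
    metrics: Metrics = field(default_factory=dict)
    logs: Logs = field(default_factory=dict)


def toy_step(guess: int, target: int) -> StepResultSketch:
    """Hypothetical single-turn step: reward 1.0 for a correct guess."""
    correct = guess == target
    return StepResultSketch(
        reward=1.0 if correct else 0.0,
        episode_done=True,            # single-turn episode ends immediately
        next_observation=[],          # nothing left to observe
        next_stop_condition=["\n"],   # placeholder stop condition
        metrics={"correct": float(correct)},  # numeric, suitable for aggregation
        logs={"guess": guess},                # diagnostic only
    )


result = toy_step(guess=4, target=4)
print(result.reward, result.episode_done, result.metrics)
```

In this sketch, numeric values intended for aggregation go in metrics, while free-form diagnostic values go in logs, matching the distinction drawn in the field descriptions.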