tinker_cookbook.rl.StepResult
class tinker_cookbook.rl.StepResult()
Result returned by Env.step.
Fields:
- reward (float) – Immediate reward for this step.
- episode_done (bool) – Whether the episode has ended.
- next_observation (Observation) – Observation for the next step (or final observation if episode_done).
- next_stop_condition (StopCondition) – Stop condition for the next generation.
- metrics (Metrics, default: field(default_factory=dict)) – Numeric values aggregated and reported in training logs (e.g., timing, counts).
- logs (Logs, default: field(default_factory=dict)) – Diagnostic info for display/debugging tools (not aggregated like metrics).
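
The field list above can be illustrated with a minimal sketch. It does not import the library: StepResultSketch is a hypothetical stand-in dataclass that mirrors the documented fields, and the type aliases, the make_final_step helper, the metric names, and the example values are all assumptions for illustration only. The real Observation, StopCondition, Metrics, and Logs types may differ.

```python
from dataclasses import dataclass, field
from typing import Any

# Placeholder aliases (assumptions): the library defines its own
# Observation, StopCondition, Metrics, and Logs types.
Observation = Any
StopCondition = Any
Metrics = dict[str, float]
Logs = dict[str, Any]


@dataclass
class StepResultSketch:
    """Hypothetical stand-in mirroring the documented StepResult fields."""

    reward: float
    episode_done: bool
    next_observation: Observation
    next_stop_condition: StopCondition
    # default_factory=dict gives every instance its own empty dict.
    metrics: Metrics = field(default_factory=dict)
    logs: Logs = field(default_factory=dict)


def make_final_step(correct: bool, elapsed_s: float) -> StepResultSketch:
    """Hypothetical helper: build the result for an episode's last step."""
    return StepResultSketch(
        reward=1.0 if correct else 0.0,
        episode_done=True,
        next_observation=[],         # placeholder final observation
        next_stop_condition=["\n"],  # placeholder stop condition
        # Numeric values only, since metrics are aggregated in training logs.
        metrics={"correct": float(correct), "step_time_s": elapsed_s},
        # Free-form diagnostics; not aggregated.
        logs={"note": "answer matched reference" if correct else "mismatch"},
    )


final = make_final_step(correct=True, elapsed_s=0.25)

# Omitting metrics and logs falls back to fresh empty dicts.
intermediate = StepResultSketch(
    reward=0.0,
    episode_done=False,
    next_observation=[],
    next_stop_condition=[],
)
```

Because both defaults come from default_factory=dict, mutating intermediate.metrics in this sketch does not affect any other instance.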