tinker_cookbook.rl.ActionExtra
class tinker_cookbook.rl.ActionExtra(TypedDict)
Extra metadata passed alongside an action to Env.step.
All fields are optional so that callers and env implementations can ignore keys they don't care about. Values must be picklable (the rollout executor may serialise them across process boundaries).
Fields:
- stop_reason (tinker.StopReason) – Why sampling stopped: "stop" (hit a stop sequence) or "length" (hit max_tokens without a stop sequence).
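As a rough illustration of how an env might consume this metadata, the sketch below uses local stand-ins instead of importing the library. ActionExtraSketch, the StopReason alias, and was_truncated are hypothetical names introduced only for this example, and it assumes the stop reason is one of the two string values listed above. It shows a missing key being tolerated and the dict surviving a pickle round trip.

```python
import pickle
from typing import Literal, Optional, TypedDict

# Local stand-in for tinker.StopReason (assumption: the two string values above).
StopReason = Literal["stop", "length"]


class ActionExtraSketch(TypedDict, total=False):
    """Hypothetical mirror of ActionExtra; total=False makes every key optional."""

    stop_reason: StopReason


def was_truncated(extra: Optional[ActionExtraSketch]) -> bool:
    """Return True only when sampling ended by hitting max_tokens."""
    if not extra:
        # No metadata supplied, or an empty dict: nothing to inspect.
        return False
    # .get() tolerates a missing key, since every field is optional.
    return extra.get("stop_reason") == "length"


full = ActionExtraSketch(stop_reason="length")
empty = ActionExtraSketch()

# A TypedDict is a plain dict at runtime, so it pickles like one.
restored = pickle.loads(pickle.dumps(full))
```

Reading the key with .get() rather than indexing keeps the helper safe for callers that pass no extra metadata at all.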