tinker_cookbook.preference.Config

class tinker_cookbook.preference.Config()

Configuration for Direct Preference Optimization (DPO) training.

This is a chz dataclass that holds all hyperparameters, infrastructure settings, and checkpointing options for a DPO training run.

config = Config(
    log_path="~/logs/dpo_run",
    model_name="meta-llama/Llama-3.1-8B-Instruct",
    dataset_builder=my_dpo_dataset_builder,
    dpo_beta=0.1,
    learning_rate=1e-5,
)
main(config)
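
To give a feel for what dpo_beta controls, the following is a minimal sketch of the standard per-pair DPO loss in plain Python. It is not the tinker_cookbook implementation: the function name dpo_pair_loss and its argument names are hypothetical, and the inputs are assumed to be summed log-probabilities of the chosen and rejected completions under the policy and a frozen reference model.

```python
import math


def dpo_pair_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    dpo_beta: float = 0.1,
) -> float:
    """Standard DPO loss for one preference pair (illustrative helper, not library code)."""
    # Log-ratio of policy to reference for each completion.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp

    # dpo_beta scales how strongly the preference margin is enforced.
    margin = dpo_beta * (chosen_logratio - rejected_logratio)

    # loss = -log(sigmoid(margin)) = log(1 + exp(-margin)), computed stably.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# The policy favors the chosen completion more than the reference does,
# so the loss falls below log(2), the value at zero margin.
loss = dpo_pair_loss(-1.0, -2.0, -1.5, -1.5, dpo_beta=0.1)
```

In this formulation, a larger dpo_beta amplifies the margin, so the same log-ratio gap yields a lower loss when the policy already prefers the chosen completion and a higher loss when it does not.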

Fields: