tinker_cookbook.supervised.ChatDatasetBuilder
class tinker_cookbook.supervised.ChatDatasetBuilder(SupervisedDatasetBuilder)
Build a chat dataset that uses a renderer to tokenise message lists.
Subclasses must implement __call__ to return the concrete datasets.
Fields:
__call__()
Build and return (train_dataset, eval_dataset).
Returns: tuple[SupervisedDataset, SupervisedDataset | None] – The training dataset, followed by the evaluation dataset or None if the builder provides no evaluation set.
property tokenizer
Get the tokenizer for this dataset's model.
Returns: Tokenizer
property renderer
Get the renderer for this dataset's model.
Returns: renderers.Renderer