Skip to content

tinker_cookbook.supervised.ChatDatasetBuilder

class tinker_cookbook.supervised.ChatDatasetBuilder(SupervisedDatasetBuilder)

Build a chat dataset that uses a renderer to tokenise message lists.

Subclasses must implement __call__ to return the concrete datasets.

Fields:

__call__()

Build and return (train_dataset, eval_dataset).

Returns: tuple[SupervisedDataset, SupervisedDataset | None] – Training dataset and an optional evaluation dataset.

property tokenizer

Get the tokenizer for this dataset's model.

Returns: Tokenizer

property renderer

Get the renderer for this dataset's model.

Returns: renderers.Renderer