Skip to content

tinker_cookbook.supervised.FromConversationFileBuilder

class tinker_cookbook.supervised.FromConversationFileBuilder(ChatDatasetBuilder)

Build a supervised dataset from a JSONL file of chat conversations.

Each line of the file must be a JSON object with a "messages" key whose value is a list of chat messages (dicts with "role" and "content").

builder = FromConversationFileBuilder(
file_path="data/conversations.jsonl",
test_size=50,
common_config=ChatDatasetBuilderCommonConfig(
model_name_for_tokenizer="Qwen/Qwen3-8B",
renderer_name="qwen3",
max_length=2048,
batch_size=8,
),
)
train_ds, test_ds = builder()

Fields:

__call__()

Load the JSONL file and return (train_dataset, test_dataset).

Returns: tuple[SupervisedDataset, SupervisedDataset | None] – Training dataset and an optional held-out evaluation dataset.

Raises:

  • DataFormatError: If any line in the file lacks a "messages" key.