Models & Pricing
All prices are per million tokens.
| Model | Tinker ID | Type | Arch | Size | Context | Prefill | Sample | Train |
|---|---|---|---|---|---|---|---|---|
Pricing Terms
- Prefill: Processing input/prompt tokens (forward pass only)
- Sample: Generating output tokens (forward pass + sampling)
- Train: Forward and backward pass for gradient computation
- Context: Maximum sequence length. Models with a `:peft:` suffix support extended context at higher prices.
- Tinker ID: The exact string to pass to `create_lora_training_client(base_model=...)` or `create_sampling_client(base_model=...)`
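Since Prefill, Sample, and Train are each billed per million tokens, a run's cost is a simple linear combination of its token counts. A minimal sketch of that arithmetic, with placeholder rates (not actual Tinker prices — see the console for real figures):

```python
def estimate_cost_usd(prefill_tokens: int, sample_tokens: int, train_tokens: int,
                      prefill_rate: float, sample_rate: float, train_rate: float) -> float:
    """Estimate run cost in USD; rates are USD per million tokens."""
    per_million = 1_000_000
    return (prefill_tokens / per_million * prefill_rate
            + sample_tokens / per_million * sample_rate
            + train_tokens / per_million * train_rate)

# Hypothetical rates: $0.10 prefill, $0.40 sample, $0.60 train per 1M tokens
cost = estimate_cost_usd(2_000_000, 500_000, 1_000_000, 0.10, 0.40, 0.60)
# 2.0 * 0.10 + 0.5 * 0.40 + 1.0 * 0.60 = $1.00
```

Note that training a single example incurs Train pricing on all of its tokens, while inference splits the same tokens across the cheaper Prefill rate and the Sample rate.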
MoE models are priced by active parameters, making them significantly more cost-effective than dense models of similar quality.
Choosing a Model
- Cost-effective: Use MoE models (highlighted in amber)
- Research/post-training: Use Base models
- Task-specific fine-tuning: Start with an Instruction or Hybrid model
- Low latency: Use Instruction models (no chain-of-thought)
- High intelligence: Use Reasoning or Hybrid models (chain-of-thought)
- Vision tasks: Use models with Vision in the type
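The guidance above boils down to a goal-to-type lookup. A sketch of that mapping for use in selection scripts; the goal keys are illustrative shorthand, not Tinker API identifiers:

```python
# Model *type* to prefer for each goal, per the guidance above.
# These are type names from the pricing table, not base_model strings.
MODEL_TYPE_FOR_GOAL = {
    "cost_effective": "MoE",
    "research_post_training": "Base",
    "task_specific_fine_tuning": "Instruction or Hybrid",
    "low_latency": "Instruction",       # no chain-of-thought
    "high_intelligence": "Reasoning or Hybrid",  # chain-of-thought
    "vision": "Vision",
}
```

Whichever type you land on, pass that model's Tinker ID from the table as `base_model` when creating a training or sampling client.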
For the latest pricing, see the Tinker Console.