EvSys

sdft_data

SDFT data shaping: a port of the cookbook's distillation math, minus orchestration. Pure functions over tinker types, with no I/O and no tinker_cookbook imports.

The Self-Distillation Fine-Tuning algorithm (Shenfeld et al., 2026):

  1. Student rollout - sample from the live student weights on the user question (no demo). Each sample is a completion + per-position logprobs.
  2. Teacher prompt - build a frozen-teacher prompt that contains the golden answer as an in-context demonstration.
  3. Teacher topK - append the student completion to the teacher prompt, then ask the teacher for its top-K token distribution at each completion position via tinker's topk_prompt_logprobs sampling API.
  4. CE distillation - train the student to match the teacher's renormalized top-K distribution at each position via cross_entropy.
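
The loss in Step 4 can be sketched for a single completion position as follows. This is a minimal illustration in plain Python, not the module's implementation: the helper names `renormalize_topk` and `soft_cross_entropy` and the dictionary of student log-probabilities are assumptions made for the example.

```python
import math

def renormalize_topk(topk_logprobs):
    """Turn top-K (token_id, logprob) pairs into weights that sum to 1."""
    probs = [math.exp(lp) for _, lp in topk_logprobs]
    total = sum(probs)
    return [(tok, p / total) for (tok, _), p in zip(topk_logprobs, probs)]

def soft_cross_entropy(teacher_topk, student_logprob_of):
    """Cross-entropy of the student against the renormalized teacher top-K.

    student_logprob_of is a hypothetical lookup from token id to the
    student's log-probability at the same position.
    """
    weights = renormalize_topk(teacher_topk)
    return -sum(w * student_logprob_of[tok] for tok, w in weights)

# The teacher's top-2 tokens cover 0.8 of its probability mass.
teacher_topk = [(11, math.log(0.6)), (42, math.log(0.2))]
weights = renormalize_topk(teacher_topk)  # 0.6 / 0.8 and 0.2 / 0.8

student = {11: math.log(0.5), 42: math.log(0.25)}
loss = soft_cross_entropy(teacher_topk, student)
```

Renormalizing matters because the top-K entries rarely sum to 1; without it, positions where the teacher is uncertain would contribute less total weight to the loss.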

This module owns Step 4's data shaping (turning teacher responses into tinker.Datum objects with (N, K)-shaped target_tokens + weights) plus the teacher-prompt helper for Step 2. The :class:`~evsys_sdk.algorithms.sdft.SDFT` algorithm orchestrates Steps 1-4.

attribute logger
= logging.getLogger(__name__)
attribute DEFAULT_DEMO_TEMPLATE
= '{question}\n\nThis is an example for a response to the question:\n{golden_answer}\n\nNow answer with a response of your own, including the thinking process.'

The cookbook's default demonstration template. Researchers usually override this in the algorithm config (e.g. to bracket the golden answer in <answer>...</answer> tags).
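
As a rough illustration of how the template's two placeholders are filled, the sketch below assumes plain `str.format` substitution, which the template's `{question}` and `{golden_answer}` fields suggest but the source does not confirm. `TAGGED_DEMO_TEMPLATE` is a hypothetical override, not part of the module.

```python
DEFAULT_DEMO_TEMPLATE = (
    "{question}\n\n"
    "This is an example for a response to the question:\n"
    "{golden_answer}\n\n"
    "Now answer with a response of your own, including the thinking process."
)

# Hypothetical override that brackets the golden answer in <answer> tags.
TAGGED_DEMO_TEMPLATE = (
    "{question}\n\n"
    "This is an example for a response to the question:\n"
    "<answer>{golden_answer}</answer>\n\n"
    "Now answer with a response of your own, including the thinking process."
)

user_turn = DEFAULT_DEMO_TEMPLATE.format(question="What is 2 + 2?", golden_answer="4")
tagged_turn = TAGGED_DEMO_TEMPLATE.format(question="What is 2 + 2?", golden_answer="4")
```

A custom template needs to keep both placeholder names if it is filled this way; a missing or misspelled field would raise `KeyError` under `str.format`.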

attribute __all__
= ['CompletionSlice', 'DEFAULT_DEMO_TEMPLATE', 'SDFTDataset', 'SimpleSDFTDataset', 'build_teacher_forced_sequence', 'build_teacher_prompt', 'build_topk_targets', 'extract_completion_tokens', 'student_datum_from_rollout']
func build_teacher_prompt(*, question, golden_answer, tokenizer, system_prompt=None, demo_template=DEFAULT_DEMO_TEMPLATE, enable_thinking=None) -> tinker.ModelInput

Render the teacher prompt (system + user-with-demo) → ModelInput.

The teacher gets to see the golden answer as a soft hint in the user turn (via demo_template). Student completions then get appended to this prompt for teacher-forced top-K scoring.

We use :func:`~evsys_sdk.training.templates.messages_to_model_input` rather than the cookbook's Renderer.build_generation_prompt: the end result is the same (the HF chat template applied with add_generation_prompt=True), without needing the Renderer class hierarchy.

param question: str
param golden_answer: str
param tokenizer: Any
param system_prompt: str | None
= None
param demo_template: str
= DEFAULT_DEMO_TEMPLATE
param enable_thinking: bool | None
= None

Returns

tinker.ModelInput
func student_datum_from_rollout(*, prompt, completion_tokens) -> tinker.Datum

Wrap a student rollout (prompt + sampled completion) as a tinker.Datum.

The Datum is what :func:`build_topk_targets` consumes: it carries the full sequence as model_input and a per-position mask indicating which positions are completion tokens (the ones the teacher scores).

Position alignment (matches the cookbook convention):

  • model_input covers prompt + completion[:-1] (left-shifted)
  • target_tokens covers completion (the loss targets)
  • mask is 1 on completion positions and 0 on prompt positions
param prompt: tinker.ModelInput
param completion_tokens: Sequence[int]

Returns

tinker.Datum
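
The position alignment described for student_datum_from_rollout can be sketched with plain token lists. This is one reading of the convention, not the module's code: it assumes the usual next-token shift, in which inputs and targets are the same sequence offset by one, and `align_rollout` is a hypothetical helper that does not build a real tinker.Datum.

```python
def align_rollout(prompt_tokens, completion_tokens):
    """Return (inputs, targets, mask) for a prompt + completion rollout.

    Assumes a non-empty prompt. Every list has length P + C - 1.
    """
    full = list(prompt_tokens) + list(completion_tokens)
    inputs = full[:-1]   # prompt + completion[:-1]
    targets = full[1:]   # what each input position should predict
    # The first P - 1 targets are prompt tokens; the last C are the completion.
    mask = [0] * (len(prompt_tokens) - 1) + [1] * len(completion_tokens)
    return inputs, targets, mask

inputs, targets, mask = align_rollout([101, 102, 103], [7, 8, 9])
# The targets selected by the mask are exactly the completion tokens.
masked_targets = [t for t, m in zip(targets, mask) if m]
```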
func extract_completion_tokens(datum, teacher_prompt_len, *, max_context_length) -> CompletionSlice

Pull the completion tokens off a student Datum, truncating if teacher_prompt + completion would exceed max_context_length.

The cookbook does the same step inline; here it's a named function so that the :class:`~evsys_sdk.algorithms.sdft.SDFT` algorithm and the tests share one implementation.

param datum: tinker.Datum
param teacher_prompt_len: int
param max_context_length: int

Returns

evsys_sdk.training.sdft_data.CompletionSlice
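
The truncation rule for extract_completion_tokens can be sketched as below. The fields of the real CompletionSlice are not listed here, so `CompletionSliceSketch` and its `truncated` flag are hypothetical stand-ins, and `slice_completion` works on a plain token list rather than a Datum.

```python
from dataclasses import dataclass

@dataclass
class CompletionSliceSketch:
    """Hypothetical stand-in; the real CompletionSlice may differ."""
    tokens: list
    teacher_prompt_len: int
    truncated: bool

def slice_completion(completion_tokens, teacher_prompt_len, *, max_context_length):
    # Tokens left for the completion once the teacher prompt is accounted for.
    budget = max(0, max_context_length - teacher_prompt_len)
    kept = list(completion_tokens[:budget])
    return CompletionSliceSketch(
        tokens=kept,
        teacher_prompt_len=teacher_prompt_len,
        truncated=len(kept) < len(completion_tokens),
    )

# A 6-token teacher prompt in an 8-token context leaves room for 2 tokens.
short = slice_completion([7, 8, 9, 10], teacher_prompt_len=6, max_context_length=8)
whole = slice_completion([7, 8], teacher_prompt_len=6, max_context_length=32)
```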
func build_teacher_forced_sequence(teacher_prompt, completion_tokens) -> tinker.ModelInput

Append completion tokens to a teacher prompt to form the sequence the teacher will score.

param teacher_prompt: tinker.ModelInput
param completion_tokens: Sequence[int]

Returns

tinker.ModelInput
func build_topk_targets(*, student_data, completion_slices, teacher_topk_logprobs, topk=20, vocab_size=None, skip_first_n=3) -> tuple[list[tinker.Datum], dict[str, float]]

Build cross_entropy Datums with (N, K) soft targets from a batch of teacher top-K responses. Pure function - no I/O.

param student_data: list[tinker.Datum]

Student Datums from :func:`student_datum_from_rollout`. Their mask tells us which positions are completion tokens; the loss targets the K-best teacher tokens at each of those positions.

param completion_slices: list[CompletionSlice]

Output of :func:`extract_completion_tokens` (one per datum). Carries the teacher_prompt_len needed to index into teacher_topk_logprobs.

param teacher_topk_logprobs: list[list[list[tuple[int, float]] | None] | None]

One list per datum. Each is a per-position list (length = teacher_prompt + completion length); each position is either None or a list of (token_id, logprob) tuples (the teacher's top-K). On real tinker this comes from sample_async(topk_prompt_logprobs=K).topk_prompt_logprobs.
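
To make the nesting concrete, the sketch below pulls the completion rows out of one datum's per-position list. It assumes the entry at index p describes the teacher's distribution for the token at index p of the teacher-forced sequence, so completion token i lives at index teacher_prompt_len + i. `completion_topk` is a hypothetical helper, and treating a missing entry as an empty row is a choice made for the example.

```python
def completion_topk(per_position, teacher_prompt_len, completion_len):
    """Return one top-K list per completion token (empty if the entry is None)."""
    rows = []
    for i in range(completion_len):
        pos = teacher_prompt_len + i
        entry = per_position[pos] if pos < len(per_position) else None
        rows.append(list(entry) if entry else [])
    return rows

# One datum: a 2-token teacher prompt followed by a 2-token completion.
per_position = [
    None,                    # first token has no preceding context
    [(5, -0.1), (6, -2.5)],  # prompt position
    [(7, -0.3), (8, -1.4)],  # completion token 0
    None,                    # completion token 1, entry missing
]
rows = completion_topk(per_position, teacher_prompt_len=2, completion_len=2)
```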

param topk: int
= 20

How many of the teacher's top tokens to keep (truncates if the teacher returned fewer).

param vocab_size: int | None
= None

If set, drop teacher token IDs >= vocab_size (handles special tokens that vLLM may emit outside the student's vocab).
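
How topk and vocab_size might combine for a single position is sketched below. The order of operations (filter out-of-vocabulary IDs first, then keep the best K) is an assumption, as is the hypothetical helper name `filter_topk`.

```python
def filter_topk(entry, *, topk=20, vocab_size=None):
    """Keep at most topk (token_id, logprob) pairs, best logprob first."""
    ranked = sorted(entry, key=lambda pair: pair[1], reverse=True)
    if vocab_size is not None:
        # Drop token ids that fall outside the student's vocabulary.
        ranked = [(tok, lp) for tok, lp in ranked if tok < vocab_size]
    return ranked[:topk]

entry = [(3, -0.2), (150000, -0.9), (9, -1.7)]
kept = filter_topk(entry, topk=2, vocab_size=100)
unfiltered = filter_topk(entry, topk=2)
```

Filtering before truncating keeps K in-vocabulary tokens whenever the teacher returned enough of them; truncating first could leave fewer than K usable targets.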

param skip_first_n: int
= 3

Exclude the first N completion positions from the loss. The default of 3 matches the reference SDFT paper.
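
One way to picture skip_first_n is as a per-position loss weight derived from the student mask. The sketch below is an illustration under that assumption; `completion_weights` is a hypothetical helper and the real function may apply the skip differently.

```python
def completion_weights(mask, skip_first_n=3):
    """Weight 1.0 on completion positions after the first skip_first_n, else 0.0."""
    weights = []
    seen = 0  # completion positions encountered so far
    for m in mask:
        if m:
            weights.append(0.0 if seen < skip_first_n else 1.0)
            seen += 1
        else:
            weights.append(0.0)  # prompt positions never contribute
    return weights

mask = [0, 0, 1, 1, 1, 1, 1]
weights = completion_weights(mask, skip_first_n=3)
```

A completion of N or fewer tokens ends up with all-zero weights under this scheme, so it contributes nothing to the loss.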

Returns

``(new_datums, metrics)`` where each new Datum has cross_entropy
func _make_topk_datum(source, targets_NK, weights_NK) -> tinker.Datum
param source: tinker.Datum
param targets_NK: torch.Tensor
param weights_NK: torch.Tensor

Returns

tinker.Datum