Algorithms
The training recipe - SFT, RL, distillation, or any custom loss - that decides what each batch contains.
An algorithm is a training recipe. It owns one thing the rest of the stack does not: how each step's batch is built and which loss it trains under. Everything else - backend allocation, step/save cadence, evaluators, logging, artifacts - is shared plumbing. Any recipe (LoRA on any model, any loss) can be supported by overriding a single method, so write your own whenever the recipe differs from SFT/RL/SDFT.
The contract
At the protocol level (src/evsys_sdk/protocols.py), an algorithm declares two
ClassVars and implements one method:
- name: ClassVar[str] - registry key and the YAML algorithm.kind.
- Config: ClassVar[type] - a Pydantic model for the recipe's params, validated against algorithm.params in YAML.
- train(self, ctx: RunContext) -> RunResult - execute one full run.
  - ctx: RunContext carries everything the run needs: run_id, output_dir, config (the parsed ExperimentConfig), data_store, log_store, backend, and a free-form extras dict (where the backend handles, the train_rows, the model_name, etc. live).
  - Returns a RunResult: run_id, status ("completed" / "failed" / "cancelled"), metrics (final scalars), artifacts (named paths, e.g. {"run_dir": ..., "checkpoint-final": ...}), and an optional error string.
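The shape of this contract can be sketched with typing.Protocol and plain dataclasses. This is a minimal illustration, not the SDK's code: RunContext and RunResult below are simplified stand-ins for the real classes in src/evsys_sdk/protocols.py, and EchoAlgorithm / EchoConfig are hypothetical names invented for the example (the real Config is a Pydantic model).

```python
from dataclasses import dataclass, field
from typing import Any, ClassVar, Optional, Protocol


@dataclass
class RunContext:
    """Simplified stand-in; the real RunContext also carries config, stores, backend."""
    run_id: str
    output_dir: str
    extras: dict[str, Any] = field(default_factory=dict)


@dataclass
class RunResult:
    """Simplified stand-in mirroring the fields listed above."""
    run_id: str
    status: str  # "completed" / "failed" / "cancelled"
    metrics: dict[str, float] = field(default_factory=dict)
    artifacts: dict[str, str] = field(default_factory=dict)
    error: Optional[str] = None


class Algorithm(Protocol):
    """Two ClassVars plus one method, as described in the contract."""
    name: ClassVar[str]
    Config: ClassVar[type]

    def train(self, ctx: RunContext) -> RunResult: ...


@dataclass
class EchoConfig:
    learning_rate: float = 1e-4  # hypothetical param for illustration


class EchoAlgorithm:
    """Hypothetical recipe that only counts the rows it was handed."""
    name: ClassVar[str] = "echo"
    Config: ClassVar[type] = EchoConfig

    def train(self, ctx: RunContext) -> RunResult:
        try:
            n_rows = len(ctx.extras.get("train_rows", []))
            return RunResult(
                run_id=ctx.run_id,
                status="completed",
                metrics={"n_rows": float(n_rows)},
                artifacts={"run_dir": ctx.output_dir},
            )
        except Exception as exc:  # report failures through the result, not a raise
            return RunResult(run_id=ctx.run_id, status="failed", error=str(exc))


ctx = RunContext("run-001", "/tmp/run-001", {"train_rows": [1, 2, 3]})
result = EchoAlgorithm().train(ctx)
```

Whether the real train reports failures through status="failed" or by raising is not stated here; the try/except is one plausible reading of the optional error field.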
The easy path: BaseAlgorithm
The Tinker-backed recipes don't reimplement train. They subclass
BaseAlgorithm (src/evsys_sdk/algorithms/base.py), which owns the whole
composer body and is itself the StepBuilder the loop drives. A concrete
algorithm overrides just the per-recipe pieces:
- async setup(self, ctx: RunContext, backend: TinkerBackend) -> None
  One-time prep before the loop. Stash per-algorithm state on self (tokenized datums for SFT; dataset + teacher client for SDFT/RL). It must also set self._steps_per_epoch. The backend is the already-allocated LoRA training client; call backend.get_tokenizer(), backend.save_for_sampler(...), etc. Returns nothing.
- async build_batch(self, step_idx: int) -> TrainingBatch
  Produce the batch for step_idx (0-based). SFT slices its static datums; RL and SDFT roll out on-policy here. The loss rides on the returned batch - there is no separate loss override. Returns a TrainingBatch.
- step_metrics(self, step_idx: int, batch: TrainingBatch, fb_result: Any) -> dict[str, float]
  (Optional, defaults to {}.) Per-step metrics computed from the forward-backward result. fb_result is what the backend's forward_backward_*_async() future resolved to; inspect fb_result.loss_fn_outputs (one entry per Datum) to derive, e.g., train_mean_nll.
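As a rough sketch of what a step_metrics override might compute, the helper below derives a train_mean_nll value from per-Datum log-probabilities. The entry layout is an assumption made for this example: each loss_fn_outputs entry is treated as a mapping with a "logprobs" list, and the per-token loss weights are passed in separately as plain lists. The real backend may wrap these in tensor types, and weighted_mean_nll is a hypothetical helper name.

```python
from typing import Sequence


def weighted_mean_nll(
    logprobs_per_datum: Sequence[Sequence[float]],
    weights_per_datum: Sequence[Sequence[float]],
) -> float:
    """Mean negative log-likelihood over tokens, weighted by the loss mask."""
    total_nll = 0.0
    total_weight = 0.0
    for logprobs, weights in zip(logprobs_per_datum, weights_per_datum):
        for lp, w in zip(logprobs, weights):
            total_nll += -lp * w
            total_weight += w
    # Guard against a batch with no supervised tokens.
    return total_nll / total_weight if total_weight > 0 else 0.0


# Stand-in for fb_result.loss_fn_outputs: one entry per Datum (layout is assumed).
loss_fn_outputs = [
    {"logprobs": [-0.5, -1.5, -2.0]},
    {"logprobs": [-1.0, -3.0]},
]
# Loss masks: 0.0 marks unsupervised tokens, 1.0 marks supervised ones.
weights = [[0.0, 1.0, 1.0], [1.0, 0.0]]

metrics = {
    "train_mean_nll": weighted_mean_nll(
        [out["logprobs"] for out in loss_fn_outputs], weights
    )
}
```

Only the three supervised tokens contribute, so masked prompt tokens cannot dilute the metric.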
BaseAlgorithmConfig supplies the shared knobs every recipe inherits
(extra="forbid"): learning_rate, num_epochs, batch_size, max_steps
(a hard cap that wins over num_epochs * steps_per_epoch), lora_rank,
renderer_name, enable_thinking, save_every / save_at_fractions,
callbacks, and Adam knobs. A subclass adds its own and may re-declare a field
to change its default (RL drops learning_rate to 1e-5).
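The interaction between max_steps and num_epochs can be worked through with a small helper. This is a sketch of one reading of "hard cap": the run stops at whichever limit is reached first. The function name total_steps and the convention that max_steps=None means "no cap" are assumptions for illustration, not the SDK's implementation.

```python
from typing import Optional


def total_steps(
    num_epochs: int, steps_per_epoch: int, max_steps: Optional[int] = None
) -> int:
    """Number of optimizer steps a run would take under the assumed cap rule."""
    planned = num_epochs * steps_per_epoch
    if max_steps is None:  # assumed convention: None disables the cap
        return planned
    return min(planned, max_steps)


uncapped = total_steps(num_epochs=3, steps_per_epoch=50)
loose_cap = total_steps(num_epochs=3, steps_per_epoch=50, max_steps=200)
tight_cap = total_steps(num_epochs=3, steps_per_epoch=50, max_steps=100)
```

With 3 epochs of 50 steps, a cap of 200 never triggers, while a cap of 100 ends the run two-thirds of the way through the planned schedule.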
TrainingBatch - where the loss lives
From src/evsys_sdk/training/loop.py, the dataclass build_batch returns:
- data: list[tinker.Datum] - the step's training examples. The loss mask on each Datum decides which tokens are supervised (the algorithm's call, not the dataset's).
- loss_fn: tinker.types.LossFnType | LossCallable - either a server-recognized string name ("cross_entropy", "importance_sampling", …) for a server-side loss, or a Python callable for a client-side custom loss. The loop dispatches the right backend method based on the type: forward_backward_async for a string, forward_backward_custom_async for a callable. A LossCallable is Callable[[model_output, batch_metadata], Any] returning a scalar tensor for the backward pass.
- loss_fn_config: dict[str, Any] | None - extra loss params; only used when loss_fn is a string.
- metrics: dict[str, float] - algorithm-precomputed per-step metrics (e.g. reward stats, teacher entropy) merged into the per-step log row.
This is the seam for any loss: focal loss, DPO, GRPO, a custom RL objective -
override build_batch, set the loss_fn, and the shared loop drives it.
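The type-based dispatch described above might look roughly like the following. Everything here is a stand-in written for illustration: Batch mimics the four TrainingBatch fields, FakeBackend only records which method was called, and the argument lists of the two forward_backward methods are assumed rather than taken from the real backend.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Callable, Optional, Union


@dataclass
class Batch:
    """Stand-in for TrainingBatch with the four fields described above."""
    data: list
    loss_fn: Union[str, Callable[..., Any]]
    loss_fn_config: Optional[dict] = None
    metrics: dict = field(default_factory=dict)


class FakeBackend:
    """Hypothetical backend that records calls instead of training."""

    def __init__(self) -> None:
        self.calls = []

    async def forward_backward_async(self, data, loss_fn, loss_fn_config=None):
        self.calls.append(f"server:{loss_fn}")

    async def forward_backward_custom_async(self, data, loss_fn):
        self.calls.append(f"client:{loss_fn.__name__}")


async def run_step(backend: FakeBackend, batch: Batch) -> None:
    """Pick the backend method from the type of batch.loss_fn."""
    if isinstance(batch.loss_fn, str):
        # Server-side loss: loss_fn_config is only meaningful on this path.
        await backend.forward_backward_async(
            batch.data, batch.loss_fn, batch.loss_fn_config
        )
    elif callable(batch.loss_fn):
        # Client-side custom loss: the callable itself is handed over.
        await backend.forward_backward_custom_async(batch.data, batch.loss_fn)
    else:
        raise TypeError(f"unsupported loss_fn type: {type(batch.loss_fn)!r}")


def my_loss(model_output, batch_metadata):
    return 0.0  # placeholder; a real LossCallable returns a scalar tensor


backend = FakeBackend()
asyncio.run(run_step(backend, Batch(data=[], loss_fn="cross_entropy")))
asyncio.run(run_step(backend, Batch(data=[], loss_fn=my_loss)))
```

Because the decision hangs on the type of a single field, swapping a server-side loss for a client-side one requires no change outside build_batch.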
Use a built-in
```yaml
algorithm:
  kind: sft            # sft | rl | sdft | local_sft | local_rl | mock_sft
  params:
    learning_rate: 1.0e-4
    batch_size: 4
    max_steps: 200
    lora_rank: 8
    supervise: all_assistant   # sft-only: which assistant turns get loss
    save_at_fractions: [1.0]
```

| Built-in | What it does |
|---|---|
| sft | Supervised fine-tuning on Tinker. Tokenizes chat rows to static datums with assistant-span masks; loss_fn="cross_entropy". Knobs: max_seq_len, supervise. |
| rl | On-policy RL on Tinker. Rolls out the batch through harbor's Job engine, group-normalizes advantages, emits loss_fn="importance_sampling" Datums. Knobs: num_samples, verifier_name, agent_import_path, n_concurrent. |
| sdft | Self-distillation FT on Tinker. On-policy student rollout → frozen-teacher top-K scoring → cross_entropy soft-target distillation. Knobs: topk, max_context_length, demo_template. |
| local_sft | TRL SFTTrainer on the local backend (transformers + peft + trl). LoRA on any HF model, runs on CPU/MPS/CUDA. |
| local_rl | TRL GRPOTrainer on the local backend, reward driven by a registered verifier (verifier_kind / verifier_params). |
| mock_sft | Fake SFT for tests: logs a deterministic loss curve and writes stub checkpoints. No backend needed. |
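The rl row mentions group-normalized advantages. A common form of that idea is sketched below: rewards for rollouts of the same prompt are centered on the group mean and scaled by the group standard deviation. The table does not say which variant the built-in uses (some recipes only subtract the mean), so the formula, the eps value, and the function name group_normalize are all assumptions.

```python
from statistics import mean, pstdev


def group_normalize(rewards: list, eps: float = 1e-6) -> list:
    """Center rewards on the group mean and scale by the group std (assumed form)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps keeps the division finite when every rollout got the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


# One group per prompt; each group holds the rewards of its sampled rollouts.
groups = {
    "prompt-a": [1.0, 0.0, 0.0, 1.0],  # mixed outcomes: informative group
    "prompt-b": [0.0, 0.0, 0.0, 0.0],  # identical outcomes: no learning signal
}
advantages = {name: group_normalize(r) for name, r in groups.items()}
```

A group whose rollouts all scored the same yields all-zero advantages, so it contributes nothing to the gradient under this scheme.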
Create your own
Subclass a concrete recipe and override one method - here, SFT with a custom client-side loss:
```python
from typing import ClassVar

from evsys_sdk.algorithms.sft import SFT, SFTConfig
from evsys_sdk.registry import register_algorithm
from evsys_sdk.training.loop import TrainingBatch


class FocalSFTConfig(SFTConfig):          # inherits extra="forbid" + all SFT knobs
    gamma: float = 2.0                    # focal-loss focusing parameter


@register_algorithm("focal_sft")
class FocalSFT(SFT):
    name: ClassVar[str] = "focal_sft"
    Config: ClassVar[type] = FocalSFTConfig

    async def build_batch(self, step_idx: int) -> TrainingBatch:
        batch = await super().build_batch(step_idx)   # reuse SFT's slicing

        def focal_loss(model_output, meta):           # a LossCallable
            import torch
            logp = model_output.logprobs              # per-token target logprob
            p = logp.exp()
            return -((1 - p) ** self.cfg.gamma * logp).mean()

        batch.loss_fn = focal_loss                    # client-side custom loss
        return batch
```

```yaml
algorithm:
  kind: focal_sft
  params:
    gamma: 2.0
    lora_rank: 8
    max_steps: 200
```

For a recipe built from scratch, subclass BaseAlgorithm instead and implement
setup + build_batch (+ optionally step_metrics).
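The focal-loss formula used in the FocalSFT example can be checked numerically without a backend. The sketch below re-implements the same expression, the mean of -(1 - p)^gamma * log p, in pure Python over a hand-picked list of token log-probabilities. The sample values are illustrative only.

```python
import math


def focal_loss(logprobs: list, gamma: float) -> float:
    """Mean of -(1 - p)**gamma * log(p), with p = exp(logprob)."""
    terms = [-((1.0 - math.exp(lp)) ** gamma) * lp for lp in logprobs]
    return sum(terms) / len(terms)


def mean_nll(logprobs: list) -> float:
    """Plain mean negative log-likelihood, for comparison."""
    return -sum(logprobs) / len(logprobs)


# Three target tokens: one easy (p=0.9), one medium (p=0.5), one hard (p=0.1).
logprobs = [math.log(0.9), math.log(0.5), math.log(0.1)]

plain = mean_nll(logprobs)
focal_gamma0 = focal_loss(logprobs, gamma=0.0)  # (1 - p)**0 == 1 for every token
focal_gamma2 = focal_loss(logprobs, gamma=2.0)  # down-weights confident tokens
```

With gamma = 0 the focal term is 1 and the loss reduces to plain mean NLL; with gamma = 2 the easy token's weight drops to about 0.01, so the hard token dominates the loss.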
Ship it in a package
A third-party package self-registers via a Python entry point in the
evsys_sdk.algorithms group:
```toml
[project.entry-points."evsys_sdk.algorithms"]
my_dpo = "my_pkg.algorithms:MyDPO"
```

Importing evsys_sdk imports the target module, running its @register_algorithm
decorator - the recipe is selectable by name from any config.yaml, no SDK fork.