EvSys

SFT

Attributes

attribute name: str = 'sft'
attribute Config: type = SFTConfig

Functions

func _check_inputs(self, ctx) -> None
param self
param ctx: RunContext

Returns

None

func setup(self, ctx, backend) -> None
param self
param ctx: RunContext
param backend: TinkerBackend

Returns

None

func build_batch(self, step_idx) -> TrainingBatch

Slice batch_size Datums for step_idx, wrapping the dataset when the slice straddles the end.

param self
param step_idx: int

Returns

evsys_sdk.training.loop.TrainingBatch
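
The wrap-around slicing described for build_batch can be sketched as follows. This is a minimal illustration, not the SDK's implementation: the helper name slice_with_wrap is hypothetical, and the choice of start offset (step_idx * batch_size modulo the dataset length) is an assumption, since the page only states that the slice wraps when it straddles the end of the dataset.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")


def slice_with_wrap(dataset: Sequence[T], batch_size: int, step_idx: int) -> list[T]:
    """Return batch_size items for step_idx, wrapping past the end of dataset.

    Hypothetical helper: the start offset formula is an assumption, not
    something the reference page specifies.
    """
    if not dataset:
        raise ValueError("dataset must not be empty")
    n = len(dataset)
    start = (step_idx * batch_size) % n
    # Modular indexing handles the case where the slice straddles the end.
    return [dataset[(start + offset) % n] for offset in range(batch_size)]


if __name__ == "__main__":
    data = ["a", "b", "c", "d", "e"]
    for step in range(4):
        print(step, slice_with_wrap(data, batch_size=2, step_idx=step))
```

With five items and a batch size of two, the third step starts at the last item and takes its second item from the front of the dataset.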

func step_metrics(self, step_idx, batch, fb_result) -> dict[str, float]

Computes train_mean_nll from the per-position logprobs of each Datum, weighted by the loss mask.

Tinker's cross_entropy loss returns loss_fn_outputs[i]["logprobs"]: a per-position vector of log-probabilities of the target token (a "perfect" prediction has logprob 0; otherwise negative). The mean NLL is -sum(logprob * weight) / sum(weight) over the loss-mask positions, averaged across the batch.

param self
param step_idx: int
param batch: TrainingBatch
param fb_result: Any

Returns

dict[str, float]
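
The weighted mean NLL described for step_metrics can be sketched in plain Python. This is an illustration under stated assumptions, not the SDK's code: the function names datum_nll and batch_mean_nll are hypothetical, inputs are plain lists rather than Tinker result objects, and "averaged across the batch" is read here as computing one ratio per Datum and then taking the unweighted mean of those ratios.

```python
from typing import Sequence


def datum_nll(logprobs: Sequence[float], weights: Sequence[float]) -> float:
    """-sum(logprob * weight) / sum(weight) for a single Datum."""
    if len(logprobs) != len(weights):
        raise ValueError("logprobs and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        # Assumption: a Datum with an all-zero loss mask is treated as an error.
        raise ValueError("loss mask has no positive-weight positions")
    weighted_sum = sum(lp * w for lp, w in zip(logprobs, weights))
    return -weighted_sum / total_weight


def batch_mean_nll(batch: Sequence[tuple[Sequence[float], Sequence[float]]]) -> float:
    """Unweighted mean of per-Datum NLLs (one reading of 'averaged across the batch')."""
    if not batch:
        raise ValueError("batch must not be empty")
    per_datum = [datum_nll(logprobs, weights) for logprobs, weights in batch]
    return sum(per_datum) / len(per_datum)


if __name__ == "__main__":
    example = [
        ([-1.0, -2.0, -3.0], [0.0, 1.0, 1.0]),  # first position is masked out
        ([0.0, -1.0], [1.0, 1.0]),              # logprob 0 is a perfect prediction
    ]
    print({"train_mean_nll": batch_mean_nll(example)})
```

A different reading, pooling all positions in the batch into a single ratio, would give a different value whenever Datums have different numbers of unmasked positions.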

func _hyperparams_extra(self) -> dict[str, Any]
param self

Returns

dict[str, typing.Any]
