EvSys

loop

TrainingLoop - algorithm-agnostic driver around a tinker training client.

The loop owns:

  • the for step in range(start, num_steps) iteration,
  • dispatching forward_backward_async then optim_step_async,
  • routing a callable loss_fn through forward_backward_custom_async,
  • periodic checkpoint saves via :class:`~evsys_sdk.training.checkpoints.CheckpointManager`,
  • periodic in-loop evaluation via :class:`Evaluator` objects,
  • writing one row per step into ctx.log_store (so the existing forward_step_metrics forwarder picks them up unchanged),
  • saving a last checkpoint, named "final", at the end of training.
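
The responsibilities listed above can be sketched roughly as follows. This is a minimal illustration, not the actual TrainingLoop implementation: FakeClient, run_loop, save_every, and the checkpoint names are hypothetical, the method signatures are simplified assumptions that may not match the real tinker client, and plain lists stand in for ctx.log_store and the checkpoint manager.

```python
import asyncio
from typing import Any, Callable, Optional


class FakeClient:
    """Hypothetical stand-in for a training client; it only records calls."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    async def forward_backward_async(self, batch: Any) -> dict:
        self.calls.append("forward_backward")
        return {"loss": 1.0}

    async def forward_backward_custom_async(self, batch: Any, loss_fn: Callable) -> dict:
        self.calls.append("forward_backward_custom")
        return {"loss": loss_fn(batch)}

    async def optim_step_async(self) -> None:
        self.calls.append("optim_step")


async def run_loop(
    client: FakeClient,
    batches: list,
    *,
    start: int = 0,
    num_steps: int = 4,
    loss_fn: Optional[Callable] = None,
    save_every: int = 2,
):
    log_rows = []      # assumption: stands in for ctx.log_store
    checkpoints = []   # assumption: stands in for checkpoint saves
    for step in range(start, num_steps):
        batch = batches[step % len(batches)]
        # A callable loss_fn is routed through the custom entry point.
        if callable(loss_fn):
            out = await client.forward_backward_custom_async(batch, loss_fn)
        else:
            out = await client.forward_backward_async(batch)
        await client.optim_step_async()
        # One row per step.
        log_rows.append({"step": step, **out})
        # Periodic checkpoint save.
        if save_every and (step + 1) % save_every == 0:
            checkpoints.append(f"step-{step + 1}")
    # Last checkpoint at the end of training.
    checkpoints.append("final")
    return log_rows, checkpoints


client = FakeClient()
rows, ckpts = asyncio.run(run_loop(client, batches=[[1, 2], [3]], num_steps=4))
```

Passing start greater than zero resumes the iteration mid-run, which is why the range begins at start rather than 0. Periodic evaluation is omitted here to keep the sketch short.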

What the loop does NOT own: data shaping. A :class:`StepBuilder` decides what each batch contains and computes algorithm-specific metrics. That's the seam SFT / SDFT / RL plug into.
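
One way to picture this seam is a small protocol that the loop calls once per step. The sketch below is an assumption for illustration only: the build method, the data and metrics fields, and SFTStepBuilder are hypothetical and may not match the real StepBuilder or TrainingBatch definitions.

```python
from dataclasses import dataclass, field
from typing import Any, Protocol


@dataclass
class TrainingBatch:
    # Hypothetical fields; the real TrainingBatch may look different.
    data: list
    metrics: dict = field(default_factory=dict)


class StepBuilder(Protocol):
    # Hypothetical method name: the loop would ask for one batch per step.
    def build(self, step: int) -> TrainingBatch: ...


class SFTStepBuilder:
    """Toy supervised builder: cycles through examples in fixed-size batches."""

    def __init__(self, examples: list, batch_size: int) -> None:
        self.examples = examples
        self.batch_size = batch_size

    def build(self, step: int) -> TrainingBatch:
        n = len(self.examples)
        lo = (step * self.batch_size) % n
        data = [self.examples[(lo + i) % n] for i in range(self.batch_size)]
        # Algorithm-specific metric computed by the builder, not the loop.
        total_chars = float(sum(len(x) for x in data))
        return TrainingBatch(data=data, metrics={"num_chars": total_chars})


def describe(builder: StepBuilder, step: int) -> dict:
    """Loop-side code depends only on the protocol, not on the algorithm."""
    batch = builder.build(step)
    return {"step": step, "batch_size": len(batch.data), **batch.metrics}


builder = SFTStepBuilder(["ab", "cde", "f"], batch_size=2)
row0 = describe(builder, 0)
row1 = describe(builder, 1)
```

Under this arrangement an RL or SDFT builder would supply its own build logic and metrics while the loop-side code stays unchanged.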

The loop is designed to be exercised end-to-end against :class:`~evsys_sdk.training.backend.MockBackend`, so tests don't need a real tinker session.
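
The interface of MockBackend is not shown here, so the following sketch uses the standard library's unittest.mock.AsyncMock as a stand-in to illustrate the same idea: asserting on call order without a live session. The one_step helper and the simplified argument lists are hypothetical.

```python
import asyncio
from unittest.mock import AsyncMock, call


async def one_step(client, batch):
    """Hypothetical single step: gradient pass first, optimizer update second."""
    result = await client.forward_backward_async(batch)
    await client.optim_step_async()
    return result


# AsyncMock creates awaitable child mocks for any attribute accessed on it.
client = AsyncMock()
client.forward_backward_async.return_value = {"loss": 0.5}

result = asyncio.run(one_step(client, batch=["example"]))

# The parent mock records child calls in the order they were made.
expected_order = [
    call.forward_backward_async(["example"]),
    call.optim_step_async(),
]
order_ok = client.mock_calls == expected_order
```

A purpose-built mock backend can go further than this, for example by returning deterministic losses, but the ordering check is the same in spirit.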

attribute logger = logging.getLogger(__name__)

attribute __all__ = ['Evaluator', 'LoopArtifacts', 'StepBuilder', 'TrainingBatch', 'TrainingLoop']