SDFTDataset
The data interface the SDFT algorithm consumes per step.
Per the SDFT paper, each step needs batch_size (question, golden_answer)
pairs: the student rolls out on the question, and the teacher scores that
rollout teacher-forced, conditioned on the question + golden_answer demo.
Functions
func __len__(self) -> int
param self
Returns
int

func get_batch(self, step_idx) -> tuple[list[str], list[str]]
Return (questions, golden_answers) of length batch_size.
param self
param step_idx int
Returns
tuple[list[str], list[str]]
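
As an illustration of the interface above, here is a minimal in-memory sketch. The class name InMemorySDFTDataset, its constructor arguments, the meaning of __len__ (number of stored pairs), and the wrap-around batching are all assumptions made for this example; the reference above specifies only the two method signatures and that get_batch returns two lists of length batch_size.

```python
class InMemorySDFTDataset:
    """Hypothetical in-memory dataset exposing __len__ and get_batch."""

    def __init__(self, pairs: list[tuple[str, str]], batch_size: int) -> None:
        if batch_size <= 0:
            raise ValueError("batch_size must be positive")
        if not pairs:
            raise ValueError("pairs must be non-empty")
        self.pairs = pairs
        self.batch_size = batch_size

    def __len__(self) -> int:
        # Assumption: length is the number of (question, golden_answer) pairs.
        return len(self.pairs)

    def get_batch(self, step_idx: int) -> tuple[list[str], list[str]]:
        # Assumption: batches wrap around the data, so every step_idx
        # yields exactly batch_size pairs.
        start = step_idx * self.batch_size
        batch = [
            self.pairs[(start + i) % len(self.pairs)]
            for i in range(self.batch_size)
        ]
        questions = [question for question, _ in batch]
        golden_answers = [answer for _, answer in batch]
        return questions, golden_answers


dataset = InMemorySDFTDataset(
    [("2+2?", "4"), ("Capital of France?", "Paris"), ("3*3?", "9")],
    batch_size=2,
)
questions, golden_answers = dataset.get_batch(1)
```

Keeping questions and golden answers in two parallel lists matches the documented return type, so index i in one list always pairs with index i in the other.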