EvSys

TrajectoryGroup

Holds all rollouts sampled from a single task (num_samples of them). The group-relative advantage uses the within-group mean reward as its baseline: each rollout's advantage is its reward minus that mean.
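The baseline described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the `Trajectory` stand-in with a single `reward` field, the way `rewards` is filled from the trajectories, and the helper name `group_relative_advantages` are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from statistics import fmean


@dataclass
class Trajectory:
    # Hypothetical stand-in: the real Trajectory type is not shown on this page.
    reward: float


@dataclass
class TrajectoryGroup:
    # Stand-in mirroring the documented attributes and __init__(self, trajectories).
    trajectories: list[Trajectory]
    rewards: list[float] = field(init=False)

    def __post_init__(self) -> None:
        # Assumption: rewards holds one reward per trajectory, in the same order.
        self.rewards = [t.reward for t in self.trajectories]


def group_relative_advantages(group: TrajectoryGroup) -> list[float]:
    """Subtract the within-group mean reward from each rollout's reward."""
    if not group.rewards:
        return []
    baseline = fmean(group.rewards)
    return [r - baseline for r in group.rewards]


# Four rollouts sampled from the same task (num_samples = 4).
group = TrajectoryGroup(
    [Trajectory(1.0), Trajectory(0.0), Trajectory(2.0), Trajectory(1.0)]
)
advantages = group_relative_advantages(group)
```

Because the baseline is the group's own mean, the advantages within a group always sum to zero: rollouts are scored only relative to their siblings from the same task.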

Attributes

attribute trajectories: list[Trajectory]
attribute rewards: list[float]

Functions

func __init__(self, trajectories) -> None
param self
param trajectories: list[Trajectory]

Returns

None
