TrajectoryGroup
All rollouts sampled from a single task; each group contains num_samples of
them. The group-relative advantage uses the within-group mean reward as its
baseline, subtracting it from each rollout's reward.
Attributes
attribute trajectories: list[Trajectory]
attribute rewards: list[float]
Functions
func __init__(self, trajectories) -> None
param self
param trajectories: list[Trajectory]
Returns
None
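
Example
The mean-subtraction baseline described above can be illustrated with a minimal sketch. It operates on a plain list of rewards rather than on a TrajectoryGroup, and the helper name group_relative_advantages is hypothetical, not part of this API. Only the mean subtraction stated in the description is shown; any further normalization the library may apply is not assumed here.

```python
from statistics import fmean


def group_relative_advantages(rewards: list[float]) -> list[float]:
    """Subtract the within-group mean reward from each reward.

    Hypothetical helper for illustration; not part of the documented API.
    """
    if not rewards:
        return []
    baseline = fmean(rewards)  # within-group mean reward
    return [r - baseline for r in rewards]


# Hypothetical rewards for one group of num_samples = 4 rollouts.
rewards = [1.0, 0.0, 0.5, 0.5]
advantages = group_relative_advantages(rewards)
print(advantages)  # [0.5, -0.5, 0.0, 0.0]
```

Because the baseline is the group mean, the advantages within a group sum to zero: rollouts that beat the group average get a positive value and the rest get a negative one.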