EvSys

BasicLoopAgent

Drives Chat(TinkerLLM(model_path)) and records the rollout: the token-level rollout_details and the completion text are stored on the AgentContext, and the completion is written to the agent dir for the host-side verifier to score.

Functions

func __init__(self, *, model_name, model_path=None, renderer_name=None, max_tokens=512, temperature=1.0, max_turns=1, system_prompt=None, model_client='tinker', api_base=None, **kw) -> None
param self
param model_name: str
param model_path: str | None = None
param renderer_name: str | None = None
param max_tokens: int = 512
param temperature: float = 1.0
param max_turns: int = 1
param system_prompt: str | None = None
param model_client: str = 'tinker'
param api_base: str | None = None
param kw: Any = {}

Returns

None
func name() -> str

Returns

str
func version(self) -> str | None
param self

Returns

str | None
func setup(self, environment) -> None
param self
param environment: BaseEnvironment

Returns

None
func _build_llm(self) -> Any

Builds the harbor sampler for this rollout. model_client selects the implementation: "tinker" → on-policy TinkerLLM (requires model_path); "litellm" → harbor's litellm LLM for any provider (model_name is a litellm model string, e.g. "anthropic/claude-opus-4-1"; API keys come from the provider's environment variables).

collect_rollout_details is ON for tinker (token-level rollouts are needed for training) but OFF for litellm: when enabled, harbor requests logprobs and extra_body.return_token_ids, which closed APIs (Anthropic, OpenAI) reject with a 400. API-model benchmarking needs only the completion, reward, and usage (cost/tokens), all of which are still captured from the response; token ids are not needed. See _trial_to_trajectory.

param self

Returns

typing.Any
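
The selection rule described for _build_llm can be illustrated with a small standalone sketch. It does not import harbor or tinker: SamplerPlan and plan_sampler are hypothetical names introduced here, and the ValueError on an unknown model_client is an assumption, not documented behavior. Only the two documented branches and the collect_rollout_details setting come from the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SamplerPlan:
    """Hypothetical stand-in for the arguments a real sampler would receive."""
    client: str
    model: str
    collect_rollout_details: bool


def plan_sampler(model_client: str, model_name: str,
                 model_path: Optional[str] = None) -> SamplerPlan:
    """Mirror the documented rule: tinker collects token-level details, litellm does not."""
    if model_client == "tinker":
        # The on-policy sampler is documented as needing model_path.
        if model_path is None:
            raise ValueError("model_client='tinker' needs model_path")
        return SamplerPlan("tinker", model_path, collect_rollout_details=True)
    if model_client == "litellm":
        # Closed APIs reject logprobs / return_token_ids, so details stay off.
        return SamplerPlan("litellm", model_name, collect_rollout_details=False)
    # Assumption: unknown clients are treated as an error in this sketch.
    raise ValueError(f"unknown model_client: {model_client!r}")
```

For instance, plan_sampler("litellm", "anthropic/claude-opus-4-1") yields a plan with collect_rollout_details set to False, while the tinker branch refuses to proceed without a model_path.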
func _cache_key(self) -> tuple
param self

Returns

tuple
func _shared_llm(self) -> Any

The cached LLM for this rollout's (loop, config) - built once per harbor job and reused by every trial (see the module cache note). The first build is warmed under a per-loop lock so concurrent trials share ONE sampling client instead of each creating their own.

param self

Returns

typing.Any
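
The build-once-and-share behavior described for _shared_llm resembles a keyed cache guarded by a per-key lock. The sketch below is only an illustration of that pattern: it uses threads and threading.Lock as a stand-in, whereas the real module may key on an event loop and use a different lock type. The names _CACHE, _LOCKS, and shared are hypothetical.

```python
import threading
from typing import Any, Callable, Dict, Hashable

# Hypothetical module-level cache: one built object per key.
_CACHE: Dict[Hashable, Any] = {}
_LOCKS: Dict[Hashable, threading.Lock] = {}
_LOCKS_GUARD = threading.Lock()  # protects creation of per-key locks


def shared(key: Hashable, build: Callable[[], Any]) -> Any:
    """Return the cached object for key, building it at most once."""
    # Get (or create) the lock dedicated to this key.
    with _LOCKS_GUARD:
        lock = _LOCKS.setdefault(key, threading.Lock())
    # Only the first caller builds; later callers wait, then reuse the result.
    with lock:
        if key not in _CACHE:
            _CACHE[key] = build()
        return _CACHE[key]
```

Because the per-key lock is held for the whole first build, concurrent callers with the same key block until the object exists and then all receive the same instance, while callers with different keys do not block each other.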
func run(self, instruction, environment, context) -> None
param self
param instruction: str
param environment: BaseEnvironment
param context: AgentContext

Returns

None
