EvSys

workspace

Workspace - local cache for remote datasets/benchmarks.

Remote-first: datasets live in the backend (D20, accessed via EvsysStore over the gateway). Streaming every row over HTTP during training is slow, so the agent materializes a dataset to a local JSONL once and trains from the local file. On pull_dataset the local copy is reused if present and complete; otherwise it's fetched from remote, written, and cached.

Safe to cache: datasets are versioned and immutable per version, so the contents behind a given dataset_id never change. The only risk is a partial pull, which is guarded by a .meta.json manifest (atomic rename + complete flag + n_rows match).
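The reuse-or-fetch flow and the manifest guard described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the Workspace implementation: the function names (is_complete, pull_dataset_sketch), the fetch_rows callable standing in for the remote store, the file naming (<dataset_id>.jsonl and <dataset_id>.meta.json), and the manifest keys are all hypothetical.

```python
import json
import os
from pathlib import Path
from typing import Any, Callable, Iterable


def _manifest_path(data_path: Path) -> Path:
    # Hypothetical naming: ds.jsonl -> ds.meta.json
    return data_path.with_suffix(".meta.json")


def is_complete(data_path: Path) -> bool:
    """True only if the manifest exists, is flagged complete, and n_rows matches."""
    meta_path = _manifest_path(data_path)
    if not (data_path.is_file() and meta_path.is_file()):
        return False
    try:
        meta = json.loads(meta_path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return False
    if not isinstance(meta, dict) or meta.get("complete") is not True:
        return False
    with data_path.open("r", encoding="utf-8") as fh:
        n_rows = sum(1 for line in fh if line.strip())
    return meta.get("n_rows") == n_rows


def pull_dataset_sketch(
    root: Path,
    dataset_id: str,
    fetch_rows: Callable[[str], Iterable[dict[str, Any]]],
) -> Path:
    """Reuse a complete local copy, otherwise fetch, write, and cache it."""
    data_path = root / f"{dataset_id}.jsonl"
    if is_complete(data_path):
        return data_path  # cache hit: no remote call

    root.mkdir(parents=True, exist_ok=True)
    meta_path = _manifest_path(data_path)
    # Drop any stale manifest first so a crash mid-pull never looks complete.
    meta_path.unlink(missing_ok=True)

    # Write rows to a temp file, then rename atomically into place.
    tmp_data = data_path.with_name(data_path.name + ".tmp")
    n_rows = 0
    with tmp_data.open("w", encoding="utf-8") as fh:
        for row in fetch_rows(dataset_id):
            fh.write(json.dumps(row) + "\n")
            n_rows += 1
    os.replace(tmp_data, data_path)

    # The manifest is written last, also via atomic rename.
    tmp_meta = meta_path.with_name(meta_path.name + ".tmp")
    tmp_meta.write_text(
        json.dumps({"complete": True, "n_rows": n_rows}), encoding="utf-8"
    )
    os.replace(tmp_meta, meta_path)
    return data_path
```

Writing the manifest last means an interrupted pull leaves either no manifest or one whose n_rows does not match the file, so the next call falls through to a fresh fetch.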

On init, the workspace root ($EVSYS_WORKSPACE or ./.evsys) gets a self-ignoring .gitignore (*), so nothing in it is ever tracked. Rows are written raw (D17); MaterializedDataset carries the dataset's format + transform so the trainer can render typed rows on read.
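As a minimal sketch of the root resolution and the self-ignoring .gitignore, assuming a hypothetical helper named init_workspace_root (the real initialization lives inside Workspace and may differ):

```python
import os
from pathlib import Path


def init_workspace_root(default: str = ".evsys") -> Path:
    """Resolve the workspace root and make sure git ignores everything in it."""
    # $EVSYS_WORKSPACE wins when set and non-empty; otherwise use ./.evsys.
    root = Path(os.environ.get("EVSYS_WORKSPACE") or default)
    root.mkdir(parents=True, exist_ok=True)

    # A .gitignore containing "*" ignores every file in this directory,
    # including the .gitignore itself, so nothing here is ever tracked.
    gitignore = root / ".gitignore"
    if not gitignore.exists():
        gitignore.write_text("*\n", encoding="utf-8")
    return root
```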

attribute __all__ = ['Workspace', 'MaterializedDataset', 'read_jsonl_rows']

func read_jsonl_rows(path) -> list[dict[str, Any]]

Read a materialized JSONL file under .evsys/ (one payload per line) into a list of dicts.

param path: str

Returns

list[dict[str, typing.Any]]
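For orientation, a function with this signature could look roughly like the sketch below. It is not the library's code: the name read_jsonl_rows_sketch is hypothetical, and skipping blank lines is an assumption about how trailing newlines are handled.

```python
import json
from typing import Any


def read_jsonl_rows_sketch(path: str) -> list[dict[str, Any]]:
    """Parse a JSONL file (one JSON object per line) into a list of dicts."""
    rows: list[dict[str, Any]] = []
    with open(path, "r", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # assumption: blank lines carry no payload
            rows.append(json.loads(line))
    return rows
```

Because rows are stored raw, the dicts returned here are untyped payloads; rendering them into typed rows is left to the format and transform carried by MaterializedDataset.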