DataConfig
Where the training data comes from and how to transform it.
The preferred way to reference a dataset is by dataset_id (or
dataset_name): the SDK pulls it from the dashboard into the local
.evsys/ workspace and trains from that cache, so stored
experiment scripts are portable and don't depend on local file layout.
When dataset_id or dataset_name is set, it takes precedence over
source_kind / path, which remain as a fallback for offline and
development use.
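The precedence rule above can be sketched as follows. This is a minimal illustration, not the SDK's implementation: DataConfigSketch and effective_source are hypothetical stand-ins that mirror only the four fields involved, and the "dashboard" label is invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataConfigSketch:
    """Hypothetical stand-in holding only the fields that affect precedence."""
    source_kind: str = "jsonl"
    dataset_id: Optional[str] = None
    dataset_name: Optional[str] = None
    path: Optional[str] = None


def effective_source(cfg: DataConfigSketch) -> str:
    """Return 'dashboard' if a dashboard reference is set, else the local source_kind."""
    # A dashboard reference wins even if source_kind / path are also filled in.
    if cfg.dataset_id is not None or cfg.dataset_name is not None:
        return "dashboard"
    # Offline / dev fallback: use the built-in loader named by source_kind.
    return cfg.source_kind


portable = DataConfigSketch(dataset_id="ds_123", path="data/train.jsonl")
local_only = DataConfigSketch(path="data/train.jsonl")
print(effective_source(portable))
print(effective_source(local_only))
```

In this sketch the path on the first config is simply ignored, which is the behavior the text describes for source_kind / path when a dashboard dataset is referenced.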
Attributes
source_kind: Literal['jsonl', 'json', 'in_memory', 'hf_dataset'] = 'jsonl'
    Which built-in source loader to use (ignored when dataset_id or dataset_name is set).
dataset_id: str | None = None
    Dashboard dataset id; pulled into .evsys/ and trained from locally.
dataset_name: str | None = None
    Dashboard dataset name; resolved to the latest version's id, then pulled.
path: str | None = None
    For 'jsonl' / 'json': absolute or store-relative path.
rows: list[dict[str, Any]] | None = None
    For 'in_memory'.
hf_dataset: str | None = None
    For 'hf_dataset': HuggingFace dataset id.
hf_split: str = 'train'
transforms: list[TransformSpec] = Field(default_factory=list)
    Applied in order to the raw rows.
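Since transforms are applied in order to the raw rows, the order of the list matters. The sketch below illustrates that with plain Python callables; it assumes nothing about the real TransformSpec type, and apply_transforms, drop_empty, and truncate_text are hypothetical helpers written only for this example.

```python
from typing import Any, Callable

Row = dict[str, Any]
Transform = Callable[[list[Row]], list[Row]]


def apply_transforms(rows: list[Row], transforms: list[Transform]) -> list[Row]:
    """Feed the output of each transform into the next, in list order."""
    for transform in transforms:
        rows = transform(rows)
    return rows


def drop_empty(rows: list[Row]) -> list[Row]:
    # Hypothetical transform: discard rows whose "text" is missing or blank.
    return [r for r in rows if r.get("text", "").strip()]


def truncate_text(rows: list[Row]) -> list[Row]:
    # Hypothetical transform: keep only the first 5 characters of "text".
    return [{**r, "text": r["text"][:5]} for r in rows]


raw_rows = [{"text": "hello world"}, {"text": "   "}, {"text": "hi"}]
cleaned = apply_transforms(raw_rows, [drop_empty, truncate_text])
print(cleaned)
```

An empty transforms list, which is the documented default, leaves the raw rows unchanged in this sketch.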