Dataset Format

A dataset provides the set of test cases evaluated by a run profile.

Pass a dataset file to a test run with:

vigilo run test --profile-file <profile.yaml> --dataset-file <dataset.yaml>

File Format

The CLI accepts YAML or JSON.

Recommended filename:

dataset.yaml

Top-level shape:

dataset_id: 018f1111-1111-7111-8111-111111111111
dataset_version: 1.0.0
cases: []
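Because the CLI accepts JSON as well as YAML, the same top-level shape can be generated from a script. The following is a minimal sketch using only the Python standard library; the helper name `to_json_text` is hypothetical and not part of Vigilo.

```python
import json

# The top-level shape from the example above, as a plain Python dict.
dataset = {
    "dataset_id": "018f1111-1111-7111-8111-111111111111",
    "dataset_version": "1.0.0",
    "cases": [],
}


def to_json_text(doc: dict) -> str:
    """Serialize a dataset dict as indented JSON text (hypothetical helper)."""
    return json.dumps(doc, indent=2)


if __name__ == "__main__":
    print(to_json_text(dataset))
```

The printed text can be saved to a file and passed to `--dataset-file` in place of the YAML form.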

Top-Level Fields

dataset_id (UUID, required)
dataset_version (string, optional)
cases (array, optional; treated as an empty array if omitted)
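The field rules above can be checked before a run. The following is a minimal sketch of such a pre-flight check, assuming the dataset has already been parsed into a Python dict; `check_dataset` is a hypothetical helper, and Vigilo's own validation may be stricter or report errors differently.

```python
import uuid


def check_dataset(doc: dict) -> dict:
    """Check the top-level fields and return a normalized copy (hypothetical helper)."""
    if "dataset_id" not in doc:
        raise ValueError("dataset_id is required")
    # uuid.UUID raises ValueError when the string is not a valid UUID.
    dataset_id = str(uuid.UUID(str(doc["dataset_id"])))

    version = doc.get("dataset_version")
    if version is not None and not isinstance(version, str):
        raise ValueError("dataset_version must be a string")

    # An omitted cases field is treated as an empty array.
    cases = doc.get("cases", [])
    if not isinstance(cases, list):
        raise ValueError("cases must be an array")

    return {"dataset_id": dataset_id, "dataset_version": version, "cases": cases}
```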

`cases[]`

Each case should map to a task type expected by at least one profile case group.

cases:
  - id: 018f1111-1111-7111-8111-111111111101
    task_type: classification
    case_group: classification
    input:
      user_message: "I love this product."
    expected:
      label: positive
    context: {}
    tags: [smoke, easy]
    metadata:
      source: synthetic

Fields:

id (UUID, required)
task_type (string, required)
case_group (string, optional): explicit run-profile case-group id to use for this case.
input (object/value, required)
expected (object/value, optional)
context (object/value, optional)
tags (string[], optional)
metadata (object, optional)
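The per-case rules listed above can be sketched as a small checker. This is an illustration only, assuming each case is a parsed Python dict; `check_case` is a hypothetical helper and does not reproduce Vigilo's actual validation or error messages.

```python
import uuid

# Required fields, taken from the field list above.
REQUIRED = ("id", "task_type", "input")


def check_case(case: dict) -> list[str]:
    """Return the problems found in one case; an empty list means none were found."""
    problems = [f"missing required field: {name}" for name in REQUIRED if name not in case]

    if "id" in case:
        try:
            uuid.UUID(str(case["id"]))
        except ValueError:
            problems.append("id is not a valid UUID")

    for name in ("task_type", "case_group"):
        if name in case and not isinstance(case[name], str):
            problems.append(f"{name} must be a string")

    tags = case.get("tags", [])
    if not (isinstance(tags, list) and all(isinstance(t, str) for t in tags)):
        problems.append("tags must be a list of strings")

    if not isinstance(case.get("metadata", {}), dict):
        problems.append("metadata must be an object")

    return problems
```

For the example case shown earlier, `check_case` returns an empty list; removing `input` yields a single "missing required field" entry.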

Alignment with Run Profile

When case_group is omitted, Vigilo selects matching profile case groups from the case's task_type and tags: cases[].task_type should match case_groups[].applies_to.task_type, and cases[].tags are matched against tags_any and tags_all.
When case_group is supplied, it is an explicit routing override: Vigilo selects the profile case group whose case_groups[].id equals that exact value. If no case group has that id, Vigilo does not fall back to task/tag matching.
case_group is part of the immutable case content. Changing only case_group changes the case hash, because it changes evaluator selection and aggregation behavior.
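The routing rules above can be sketched in a few lines. This is not Vigilo's implementation; it is an illustration under stated assumptions: that tags_any and tags_all sit under applies_to next to task_type, that tags_any means "the case has at least one of these tags", and that tags_all means "the case has every one of these tags". `select_case_groups` is a hypothetical name.

```python
def select_case_groups(case: dict, case_groups: list[dict]) -> list[dict]:
    """Pick the profile case groups for one dataset case (illustrative sketch)."""
    explicit = case.get("case_group")
    if explicit is not None:
        # Explicit override: exact id match only, with no fallback to
        # task/tag matching when the id is not found.
        return [g for g in case_groups if g.get("id") == explicit]

    case_tags = set(case.get("tags", []))
    selected = []
    for group in case_groups:
        applies = group.get("applies_to", {})
        if applies.get("task_type") != case.get("task_type"):
            continue
        tags_any = set(applies.get("tags_any", []))  # assumed location and semantics
        tags_all = set(applies.get("tags_all", []))  # assumed location and semantics
        if tags_any and not (case_tags & tags_any):
            continue
        if not tags_all <= case_tags:
            continue
        selected.append(group)
    return selected
```

In this sketch an unknown case_group id yields an empty selection. How Vigilo reports that situation is not covered here.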

Dataset case content is stored as immutable case blobs for reproducibility and retry execution. Run profile persistence.mode: summary redacts execution-level case snapshots, but it does not remove the underlying dataset case blobs needed by workers.
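To see why changing only case_group produces a different case hash, consider a content hash computed over the whole case. The sketch below assumes canonical JSON hashed with SHA-256; the actual canonicalization and hash function used by Vigilo are not described here, and `case_hash` is a hypothetical helper.

```python
import hashlib
import json


def case_hash(case: dict) -> str:
    """Hash a case's full content (assumed scheme: canonical JSON + SHA-256)."""
    # Sorting keys and fixing separators makes the text independent of key order.
    canonical = json.dumps(case, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


base = {
    "id": "018f1111-1111-7111-8111-111111111101",
    "task_type": "classification",
    "input": {"user_message": "I love this product."},
}
rerouted = {**base, "case_group": "classification"}

# Same id, task_type, and input, but different hashes because case_group
# is part of the hashed content.
print(case_hash(base) != case_hash(rerouted))
```

Under any scheme of this kind, editing case_group yields a new blob rather than modifying the stored one.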

Example

See the repository sample dataset:

example/dataset.yaml

Related

Run profile configuration: web/docs/configuration/run-profile.mdx
Example project: example/README.md
