Dataset Format

A dataset provides the set of test cases evaluated by a run profile.

Pass a dataset file to a run with:

vigilo run test --profile-file <profile.yaml> --dataset-file <dataset.yaml>

File Format

The CLI accepts YAML or JSON.

Recommended filename:

  • dataset.yaml

Top-level shape:

dataset_id: 018f1111-1111-7111-8111-111111111111
dataset_version: 1.0.0
cases: []

Top-Level Fields

  • dataset_id (UUID, required)
  • dataset_version (string, optional)
  • cases (array, optional; empty array if omitted)
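
The top-level rules above could be checked before a run with a small script. This is a minimal sketch that reads the JSON form of a dataset using only the Python standard library. The helper name `load_dataset` is hypothetical and not part of vigilo, and the CLI's own validation may be stricter or differ in detail.

```python
import json
import uuid


def load_dataset(text: str) -> dict:
    """Parse a JSON dataset and apply the top-level field rules.

    Hypothetical helper for illustration; not part of the vigilo CLI.
    """
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("dataset must be a mapping at the top level")

    # dataset_id is required and must parse as a UUID.
    if "dataset_id" not in data:
        raise ValueError("dataset_id is required")
    uuid.UUID(str(data["dataset_id"]))  # raises ValueError if malformed

    # dataset_version is optional; when present it must be a string.
    version = data.get("dataset_version")
    if version is not None and not isinstance(version, str):
        raise ValueError("dataset_version must be a string")

    # cases is optional and defaults to an empty array.
    cases = data.setdefault("cases", [])
    if not isinstance(cases, list):
        raise ValueError("cases must be an array")
    return data


dataset = load_dataset('{"dataset_id": "018f1111-1111-7111-8111-111111111111"}')
print(dataset["cases"])  # prints []
```

Filling in the default for `cases` at load time means later code can iterate over `dataset["cases"]` without checking whether the key was present.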

cases[]

Each case should map to a task type expected by at least one profile case group.

cases:
  - id: 018f1111-1111-7111-8111-111111111101
    task_type: classification
    case_group: classification
    input:
      user_message: "I love this product."
    expected:
      label: positive
    context: {}
    tags: [smoke, easy]
    metadata:
      source: synthetic

Fields:

  • id (UUID, required)
  • task_type (string, required)
  • case_group (string, optional)
  • input (object/value, required)
  • expected (object/value, optional)
  • context (object/value, optional)
  • tags (string[], optional)
  • metadata (object, optional)
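
As a rough illustration of the per-case field rules, the sketch below collects problems for a single case that has already been parsed into a Python dict. The function `check_case` and its messages are hypothetical. The check only covers the types listed above and says nothing about how vigilo itself reports errors.

```python
import uuid

# Fields marked "required" in the list above.
REQUIRED = ("id", "task_type", "input")


def check_case(case: dict) -> list[str]:
    """Return a list of problems found in one dataset case (empty if none).

    Hypothetical helper for illustration; not part of the vigilo CLI.
    """
    problems = [
        f"missing required field: {name}" for name in REQUIRED if name not in case
    ]

    # id must parse as a UUID when present.
    if "id" in case:
        try:
            uuid.UUID(str(case["id"]))
        except ValueError:
            problems.append("id is not a valid UUID")

    # task_type and case_group are strings.
    for name in ("task_type", "case_group"):
        if name in case and not isinstance(case[name], str):
            problems.append(f"{name} must be a string")

    # tags is an optional list of strings.
    tags = case.get("tags", [])
    if not isinstance(tags, list) or not all(isinstance(t, str) for t in tags):
        problems.append("tags must be a list of strings")

    # metadata is an optional object.
    if "metadata" in case and not isinstance(case["metadata"], dict):
        problems.append("metadata must be an object")
    return problems


case = {
    "id": "018f1111-1111-7111-8111-111111111101",
    "task_type": "classification",
    "input": {"user_message": "I love this product."},
    "tags": ["smoke", "easy"],
}
print(check_case(case))  # prints []
```

The `input`, `expected`, and `context` fields are left unchecked beyond presence because they accept either an object or a plain value.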

Alignment with Run Profile

  • cases[].task_type should match case_groups[].applies_to.task_type.
  • cases[].tags can be matched against tags_any and tags_all.
  • cases[].case_group can be used for routing and reporting consistency.
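
One way to picture the matching described above is the sketch below. It assumes that `tags_any` selects a case carrying at least one of the listed tags and that `tags_all` selects a case carrying every listed tag. Those semantics are inferred from the field names, and the function `case_matches` is hypothetical, so treat the run profile documentation as the authority.

```python
def case_matches(case: dict, task_type: str, tags_any=(), tags_all=()) -> bool:
    """Decide whether a case would be selected by a case group.

    Assumed selector semantics for illustration; the real rules may differ.
    """
    # The case's task_type must equal the group's applies_to.task_type.
    if case.get("task_type") != task_type:
        return False

    tags = set(case.get("tags", []))

    # tags_any: assumed to require at least one listed tag when non-empty.
    if tags_any and not tags & set(tags_any):
        return False

    # tags_all: assumed to require every listed tag (empty list always passes).
    return set(tags_all) <= tags


case = {"task_type": "classification", "tags": ["smoke", "easy"]}
print(case_matches(case, "classification", tags_any=["smoke", "hard"]))  # prints True
print(case_matches(case, "classification", tags_all=["smoke", "hard"]))  # prints False
```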

Example

See the repository sample dataset and related documentation:

  • example/dataset.yaml
  • Run profile configuration: web/docs/configuration/run-profile.mdx
  • Example project: example/README.md