Usage
Subcommands
| Subcommand | Description |
|---|---|
| `run` | Execute a single scenario from a YAML file |
| `matrix` | Execute a scenario matrix for parameter exploration |
run
Execute a scenario defined in a YAML file.
Usage
Arguments
| Argument | Description |
|---|---|
| `<filePath>` | Path to the `.scenario.yaml` file |
Options
| Option | Description | Default |
|---|---|---|
| `-l, --live` | Run in live mode, ignoring mocks | `false` |
Examples
matrix
Execute a scenario matrix to explore parameter combinations.
Usage
Arguments
| Argument | Description |
|---|---|
| `<configPath>` | Path to the matrix configuration YAML file |
Options
| Option | Description | Default |
|---|---|---|
| `--dry-run` | Show matrix analysis without executing | `false` |
| `--parallel <number>` | Maximum number of parallel test runs | `1` |
| `--filter <pattern>` | Filter parameter combinations by pattern | - |
| `--verbose` | Show detailed progress information | `false` |
Examples
Scenario YAML Format
Basic Structure
Scenario Fields
| Field | Description | Required |
|---|---|---|
| `name` | Scenario name | Yes |
| `description` | Scenario description | No |
| `plugins` | List of plugins to load | No |
| `setup` | Setup configuration (mocks, files) | No |
| `run` | List of execution steps | Yes |
| `judgment` | How to determine pass/fail | No |
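
The required and optional fields above lend themselves to a quick pre-flight check. The following is a minimal sketch, not the tool's own validation: it assumes the scenario YAML has already been parsed into a plain dictionary, and `check_scenario_fields` is a hypothetical helper name. Only the field names come from the table.

```python
# Field names taken from the Scenario Fields table; everything else is illustrative.
REQUIRED_FIELDS = {"name", "run"}


def check_scenario_fields(scenario: dict) -> list[str]:
    """Return one message per missing required field (empty list if none are missing)."""
    missing = sorted(REQUIRED_FIELDS - set(scenario))
    return [f"missing required field: {field}" for field in missing]


# Hypothetical scenario, already parsed from YAML into a dict; it lacks `run`.
scenario = {"name": "greeting check", "description": "no steps yet"}
print(check_scenario_fields(scenario))  # ['missing required field: run']
```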
Evaluation Types
| Type | Description |
|---|---|
| `string_contains` | Check if output contains a string |
| `regex_match` | Match output against regex pattern |
| `llm_evaluation` | Use LLM to evaluate response quality |
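
To make the two deterministic evaluation types concrete, here is a rough Python sketch of what such checks could look like. It is an illustration under assumptions, not the tool's implementation: the function signatures are hypothetical, and whether the real `regex_match` searches anywhere in the output or requires a full match is not stated here (the sketch searches anywhere). `llm_evaluation` is omitted because it depends on a model call.

```python
import re


def string_contains(output: str, expected: str) -> bool:
    """Pass when the expected substring occurs anywhere in the output."""
    return expected in output


def regex_match(output: str, pattern: str) -> bool:
    """Pass when the pattern matches somewhere in the output (assumed semantics)."""
    return re.search(pattern, output) is not None


# Hypothetical agent output used only for illustration.
output = "Order #4821 has shipped."
print(string_contains(output, "shipped"))  # True
print(regex_match(output, r"#\d{4}\b"))    # True
print(regex_match(output, r"^shipped"))    # False
```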
Judgment Strategies
| Strategy | Description |
|---|---|
| `all_pass` | All evaluations must pass |
| `any_pass` | At least one evaluation must pass |
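
The two strategies differ only in how individual evaluation results are combined. A minimal sketch, assuming each evaluation has already been reduced to a boolean; `judge` is a hypothetical helper, not part of the tool:

```python
def judge(strategy: str, results: list[bool]) -> bool:
    """Combine per-evaluation results according to the named strategy."""
    if strategy == "all_pass":
        return all(results)
    if strategy == "any_pass":
        return any(results)
    raise ValueError(f"unknown judgment strategy: {strategy}")


# Hypothetical results: two evaluations passed, one failed.
results = [True, False, True]
print(judge("all_pass", results))  # False
print(judge("any_pass", results))  # True
```

In this sketch an empty result list passes under `all_pass` and fails under `any_pass`, which is how Python's `all` and `any` behave; how the tool itself treats a scenario with no evaluations is not covered here.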
Matrix Configuration
Basic Structure
Matrix Fields
| Field | Description | Required |
|---|---|---|
| `name` | Matrix name | Yes |
| `description` | Matrix description | No |
| `base_scenario` | Path to base scenario file | Yes |
| `runs_per_combination` | Runs per parameter combination | Yes |
| `matrix` | List of parameter axes | Yes |
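
A matrix can grow quickly, so it helps to estimate the number of runs before executing. The sketch below assumes that the tool takes the full Cartesian product of the axes and repeats each combination `runs_per_combination` times; that assumption, the axis names, and the `expand_matrix` helper are illustrative and not taken from the tool.

```python
from itertools import product


def expand_matrix(axes: dict[str, list]) -> list[dict]:
    """Return every combination of axis values as a parameter dictionary."""
    names = list(axes)
    return [dict(zip(names, values)) for values in product(*axes.values())]


# Hypothetical axes: parameter name -> values to explore.
axes = {"model": ["small", "large"], "temperature": [0.0, 0.5, 1.0]}
runs_per_combination = 3

combinations = expand_matrix(axes)
total_runs = len(combinations) * runs_per_combination

print(len(combinations))  # 6
print(total_runs)         # 18
print(combinations[0])    # {'model': 'small', 'temperature': 0.0}
```

Under these assumptions, 2 models and 3 temperatures give 6 combinations, and 3 runs each give 18 runs in total.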
Output Structure
Scenario runs generate output in the `_logs_` directory.
Mocking
Scenarios support mocking for deterministic testing.
Plugins
Specify the plugins to load for the scenario in the `plugins` field. The default plugins (plugin-sql, plugin-bootstrap, plugin-openai) are always loaded.

