CSV Import¶
You have a CSV file -- maybe a Grafana export from a production incident, maybe a hand-recorded
dataset -- and you want to turn it into a portable, parameterized scenario that uses Sonda's
generators instead of replaying raw values. sonda import analyzes the data, detects dominant
patterns, and generates scenario YAML you can run, share, and customize.
Why import instead of replay?¶
The csv_replay generator plays back raw CSV values verbatim. That is
useful for exact reproduction, but the output is tied to the original file. sonda import
takes a different approach:
- Portable -- the generated YAML uses generators (steady, spike_event, leak, flap, sawtooth, step), so it runs without the original CSV file.
- Parameterized -- you can tune rate, duration, and generator parameters after import.
- Shareable -- the YAML is self-contained. Drop it into a repo, CI pipeline, or Helm chart.
Use csv_replay when you need bit-for-bit fidelity. Use sonda import when you need the
shape of the data as a reusable scenario.
The workflow¶
sonda import has three modes that form a natural pipeline:
Step 1: Analyze¶
Start by understanding what the data looks like. --analyze is read-only -- it prints
detected patterns without generating any files.
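Assuming your data lives in a file such as data.csv (the path is illustrative; any CSV works), the analysis step looks like this:

```shell
# Read-only: prints detected patterns, writes no files
sonda import data.csv --analyze
```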
CSV Import Analysis
============================================================
Column 1 (index 1): cpu_percent
Data points: 20
Range: [12.30, 96.10] Mean: 46.27
Detected pattern: steady (center=46.27, amplitude=41.90)
Column 2 (index 2): mem_percent
Data points: 20
Range: [45.20, 86.20] Mean: 59.88
Detected pattern: steady (center=59.88, amplitude=20.50)
Column 3 (index 3): disk_io_mbps
Data points: 20
Range: [5.00, 65.80] Mean: 25.04
Detected pattern: steady (center=25.04, amplitude=30.40)
Each column shows the metric name (from the header), basic statistics, and the detected pattern with extracted parameters.
Step 2: Generate¶
Once you know the patterns look right, generate a scenario YAML file:
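For example, with the same hypothetical data.csv and an output path of your choosing:

```shell
# Write the generated scenario YAML to a path you pick
sonda import data.csv -o scenario.yaml
```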
The generated file is a valid multi-scenario YAML, ready for sonda run --scenario:
scenarios:
- signal_type: metrics
name: cpu_percent
rate: 1
duration: 60s
generator:
type: steady
center: 46.27
amplitude: 41.9
period: "60s"
encoder:
type: prometheus_text
sink:
type: stdout
- signal_type: metrics
name: mem_percent
rate: 1
duration: 60s
generator:
type: steady
center: 59.88
amplitude: 20.5
period: "60s"
encoder:
type: prometheus_text
sink:
type: stdout
# ... (one entry per column)
Single-column CSVs produce flat YAML
When the CSV has only one data column, the output is a flat scenario (no scenarios: wrapper).
Multi-column CSVs always produce the scenarios: list format for use with sonda run.
Step 3: Run¶
If you just want to see the output without saving a file, --run generates the scenario in
memory and executes it immediately:
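For example (data.csv is illustrative):

```shell
# Generate the scenario in memory and execute it immediately
sonda import data.csv --run
```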
cpu_percent 41.44404065390504 1775712694328
cpu_percent 46.07410906869991 1775712695333
cpu_percent 50.131242022026555 1775712696330
cpu_percent 55.42337922089686 1775712697333
Grafana CSV exports¶
sonda import understands Grafana's "Series joined by time" CSV format. It parses the
{__name__="...", key="value"} headers to extract metric names and labels automatically.
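Analyzing a Grafana export works the same way as any other CSV (the filename below is illustrative):

```shell
# Grafana "Series joined by time" exports are detected automatically
sonda import grafana_export.csv --analyze
```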
CSV Import Analysis
============================================================
Column 1 (index 1): up
Labels: {instance="localhost:9090", job="prometheus"}
Data points: 10
Range: [0.00, 1.00] Mean: 0.80
Detected pattern: sawtooth (min=0.00, max=1.00, period=4pts)
Column 2 (index 2): up
Labels: {instance="localhost:9100", job="node"}
Data points: 10
Range: [0.00, 1.00] Mean: 0.80
Detected pattern: sawtooth (min=0.00, max=1.00, period=6pts)
Labels are preserved in the generated YAML:
scenarios:
- signal_type: metrics
name: up
rate: 1
duration: 60s
generator:
type: sawtooth
min: 0.0
max: 1.0
period_secs: 4.0
labels:
instance: "localhost:9090"
job: prometheus
encoder:
type: prometheus_text
sink:
type: stdout
For details on exporting from Grafana, see the Grafana CSV Export Replay guide.
Selecting columns¶
By default, all non-timestamp columns are imported. Use --columns to pick specific ones
by their zero-based index:
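For example, to restrict the import to columns 1 and 3 (data.csv is illustrative):

```shell
# Keep cpu_percent (index 1) and disk_io_mbps (index 3); skip index 2
sonda import data.csv --columns 1,3 --analyze
```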
CSV Import Analysis
============================================================
Column 1 (index 1): cpu_percent
Data points: 20
Range: [12.30, 96.10] Mean: 46.27
Detected pattern: steady (center=46.27, amplitude=41.90)
Column 2 (index 3): disk_io_mbps
Data points: 20
Range: [5.00, 65.80] Mean: 25.04
Detected pattern: steady (center=25.04, amplitude=30.40)
Column 0 is always the timestamp and cannot be selected for import.
Detected patterns¶
The pattern detector uses statistical analysis to classify each column into one of six patterns. Each pattern maps to a Sonda generator or operational vocabulary alias.
| Pattern | What it looks like | Generator / alias | Key parameters |
|---|---|---|---|
| Steady | Low variance around a center | steady | center, amplitude, period |
| Spike | Periodic outliers above a baseline | spike_event | baseline, spike_height, spike_duration, spike_interval |
| Climb | Monotonic upward trend | leak | baseline, ceiling, time_to_ceiling |
| Sawtooth | Repeating climb-reset cycles | sawtooth | min, max, period_secs |
| Flap | Bimodal toggle (up/down) | flap | up_value, down_value, up_duration, down_duration |
| Step | Constant-rate counter increments | step | start, step_size |
The detector runs through these in priority order. When the data does not clearly match a more specific pattern, it falls back to steady.
Pattern detection is heuristic
The detector uses statistical thresholds (linear regression, IQR outlier detection, k-means clustering) to classify patterns. With very short time series (fewer than 10 data points), detection accuracy decreases. For best results, export at least 20-30 data points.
Customizing generated scenarios¶
The generated YAML is a starting point. After import, you can:
- Change the sink -- replace stdout with remote_write, loki, or any other sink.
- Adjust parameters -- tune amplitude, period, or baseline to match your needs.
- Add scheduling -- add gaps:, bursts:, or cardinality_spike: blocks.
- Override rate and duration at generation time:
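The rate and duration overrides can be passed directly on the import command (filenames and values below are illustrative):

```shell
# Generate at 5 events/second for 10 minutes instead of the defaults (1.0, 60s)
sonda import data.csv -o scenario.yaml --rate 5 --duration 10m
```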
CLI reference¶
| Argument / Flag | Type | Default | Description |
|---|---|---|---|
| &lt;FILE&gt; | path | -- | CSV file to import. Supports Grafana exports and plain CSV. |
| --analyze | flag | -- | Print detected patterns (read-only). Conflicts with -o and --run. |
| -o, --output &lt;FILE&gt; | path | -- | Write generated scenario YAML to this path. Conflicts with --analyze and --run. |
| --run | flag | -- | Generate and immediately execute the scenario. Conflicts with --analyze and -o. |
| --columns &lt;INDICES&gt; | string | all | Comma-separated column indices (e.g., 1,3,5). Column 0 is the timestamp. |
| --rate &lt;RATE&gt; | float | 1.0 | Events per second in the generated scenario. |
| --duration &lt;DURATION&gt; | string | 60s | Duration of the generated scenario (e.g., 60s, 5m). |
Exactly one of --analyze, -o, or --run must be specified.
Combine with global flags
--dry-run, --verbose, and --quiet work with sonda import --run, just like any
other subcommand. Use sonda --dry-run import data.csv --run to see the resolved config
without emitting events.