# Alert Testing
3 a.m. The pager goes off for `HighRequestLatency`. By the time you log in, latency
is back below threshold and the alert has cleared. You spend an hour reading dashboards
and find nothing -- the spike was real, but it lasted 90 seconds and your `for: 5m`
clause silently swallowed it. The alert is doing exactly what you told it to. You just
told it the wrong thing.
That whole class of problem -- `for:` durations that swallow real spikes, gap-fill rules
that fire during scrape outages, compound `A AND B` rules where the two signals never
overlap -- only shows up in production because nothing else generates the right metric
shape. Sonda does. You write the alert, run a scenario that crosses the threshold for
exactly the duration you care about, and watch whether the alert fires.
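To make that concrete, here is the shape of the rule from the anecdote, in standard Prometheus alerting-rule syntax. The expression and threshold are hypothetical; the `for: 5m` is the part under test:

```yaml
groups:
  - name: latency
    rules:
      - alert: HighRequestLatency
        # Hypothetical expression -- substitute your own metric and threshold.
        expr: http_request_duration_seconds{quantile="0.99"} > 0.5
        # Fires only if the expression stays true for 5 unbroken minutes.
        # A 90-second spike crosses the threshold and recovers before this
        # window elapses, so the rule is silently blind to it.
        for: 5m
```

The test is two scenario runs: hold the metric above the threshold for 90 seconds and confirm the alert stays silent, then hold it for six minutes and confirm it fires. The Threshold and `for:` duration page shows the scenario syntax for both runs.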
This page is the entry point. Five focused sub-pages cover the patterns; the table below maps each common alert shape to the right one.
## Pick your pattern
| You want to test... | Go to | Generator |
|---|---|---|
| A simple `>` threshold rule | Threshold and `for:` duration | `sine` |
| A short `for:` clause (≤ 30s) | Threshold and `for:` duration | `sequence` |
| A long `for:` clause (minutes) | Threshold and `for:` duration | `constant` |
| Resolution / flapping behavior | Resolution and recovery | any + `gaps` |
| Compound `A AND B` rules | Compound and correlated alerts | multi-scenario |
| Cardinality guardrails | Cardinality explosion alerts | any + `cardinality_spikes` |
| Replaying a known incident | Replaying recorded incidents | `sequence` or `csv_replay` |
The pages are written as a tour and link forward to one another, but each one stands on its own -- jump straight to the one that matches the rule you are testing.
## The tour
- Threshold and `for:` duration -- `sine` for predictable crossings, `sequence` for exact breach windows, `constant` for sustained load.
- Resolution and recovery -- gap windows that drop the metric so you can confirm the alert clears.
- Compound and correlated alerts -- `phase_offset` and `clock_group` to overlap two scenarios for `A AND B` rules (sketched just below).
- Cardinality explosion alerts -- `cardinality_spikes` for testing series-count guardrails.
- Replaying recorded incidents -- `sequence` for short patterns, `csv_replay` for production exports.
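The compound case deserves a word on *why* the overlap matters: PromQL's `and` keeps a left-hand sample only when the right-hand side has a matching sample at the same evaluation instant. A minimal sketch, with made-up metric names (`cpu_usage`, `error_rate`) standing in for your real signals:

```yaml
groups:
  - name: compound-example
    rules:
      # Hypothetical rule for illustration.
      - alert: HighCpuAndErrors
        # `and on (instance)` keeps a cpu_usage sample only when an
        # error_rate sample above the threshold exists for the same
        # instance at the same evaluation instant. Two test scenarios
        # that breach at different times leave this expression empty
        # forever, and the alert can never fire -- which is exactly
        # what phase_offset and clock_group exist to prevent.
        expr: cpu_usage > 80 and on (instance) error_rate > 1
        for: 2m
```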
## Push to a real backend
Once you can shape the alert pattern locally, push it into a real TSDB and verify the
alert fires there. The push-and-query loop -- start the backend, run the scenario,
curl the query API -- is the same one E2E Testing walks through,
with the full coverage matrix of encoder and sink combinations.
For alerting specifically, the two scenarios you will reach for first are
`examples/vm-push-scenario.yaml` (Prometheus text via `http_push`) and
`examples/remote-write-vm.yaml` (`remote_write` to VictoriaMetrics, vmagent, or
upstream Prometheus). Both land in the stack from
`examples/docker-compose-victoriametrics.yml`:
```bash
# Start the stack
docker compose -f examples/docker-compose-victoriametrics.yml up -d

# Push test data
sonda metrics --scenario examples/vm-push-scenario.yaml

# Verify the metric exists (wait ~15s for ingestion)
curl "http://localhost:8428/api/v1/query?query=cpu_usage"

# Tear down
docker compose -f examples/docker-compose-victoriametrics.yml down -v
```
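If the instant query comes back empty, a range query shows whether the data landed at all and what shape it has. This uses the standard Prometheus-compatible `query_range` endpoint that VictoriaMetrics serves on the same port; the 10-minute window is arbitrary:

```bash
# Inspect the last 10 minutes of the pushed series.
# (GNU date shown; on macOS use: date -v-10M +%s)
start=$(date -u -d '10 minutes ago' +%s)
end=$(date -u +%s)
curl "http://localhost:8428/api/v1/query_range?query=cpu_usage&start=${start}&end=${end}&step=15s"
```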
| Service | Port | Purpose |
|---|---|---|
| sonda-server | 8080 | REST API for scenario management |
| VictoriaMetrics | 8428 | Time series database |
| vmagent | 8429 | Metrics relay agent |
| Grafana | 3000 | Dashboards (auto-provisioned) |
See Docker Deployment for the full stack configuration.
**Close the loop with Alertmanager.** This stack verifies that data arrives in VictoriaMetrics, but it does not prove alerts fire. To add vmalert, Alertmanager, and a webhook receiver to the stack, see the Alerting Pipeline guide.
## Scrape model instead of push
If you prefer the Prometheus pull model, sonda-server exposes a scrape endpoint for each running scenario. Start the server and submit a scenario:
```bash
cargo run -p sonda-server -- --port 8080

# In another terminal:
curl -X POST -H "Content-Type: text/yaml" \
  --data-binary @examples/sine-threshold-test.yaml \
  http://localhost:8080/scenarios
```
The response includes a scenario ID. Configure Prometheus to scrape it:
```yaml
scrape_configs:
  - job_name: sonda
    scrape_interval: 15s
    static_configs:
      - targets: ['localhost:8080']
    metrics_path: /scenarios/<scenario-id>/metrics
```
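Before pointing Prometheus at it, you can spot-check the endpoint by hand; it should answer with the scenario's metrics in Prometheus text exposition format (substitute the ID returned when you submitted the scenario):

```bash
curl http://localhost:8080/scenarios/<scenario-id>/metrics
```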
See Server API for the full API reference.
## Quick reference
| Pattern | Generator | Example file |
|---|---|---|
| Threshold crossing | `sine` | `sine-threshold-test.yaml` |
| Sustained breach | `constant` | `constant-threshold-test.yaml` |
| Alert resolution via gap | `constant` + `gaps` | `gap-alert-test.yaml` |
| Precise `for:` duration | `sequence` | `for-duration-test.yaml` |
| Compound alert | multi-scenario | `multi-metric-correlation.yaml` |
| Cardinality explosion | any + `cardinality_spikes` | `cardinality-alert-test.yaml` |
| Periodic spike / anomaly | `spike` | `spike-alert-test.yaml` |
| Incident replay (inline) | `sequence` | `sequence-alert-test.yaml` |
| Incident replay (file) | `csv_replay` | `csv-replay-metrics.yaml` |
| Push to VictoriaMetrics | any | `vm-push-scenario.yaml` |
| Remote write | any | `remote-write-vm.yaml` |
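Every file in the table runs through the same CLI entry point as the push example above (or can be POSTed to `sonda-server` for the scrape model):

```bash
# Generate the threshold-crossing pattern from the quick-reference table.
sonda metrics --scenario examples/sine-threshold-test.yaml
```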
## Next steps
- Verifying alerts fire end-to-end? See Alerting Pipeline to run vmalert, Alertmanager, and a webhook receiver with Docker Compose.
- Validating alert rules in CI? See CI Alert Validation to catch broken rules before they reach production.
- Validating a pipeline change? See Pipeline Validation.
- Verifying recording rules? See Recording Rules.
- Browsing all example scenarios? See Example Scenarios.