Synthetic Monitoring

Your dashboards look great -- until the data source goes quiet and you stare at flat lines wondering if it's a real outage or a broken scrape config. Long-running synthetic monitoring gives you a persistent baseline of known metrics flowing through your stack, so you can tell "no data" from "data stopped arriving" at a glance.

This guide walks you through deploying sonda-server on Kubernetes, submitting scenarios that run for hours or days, scraping the generated metrics with Prometheus, and building Grafana dashboards to monitor both the synthetic data and Sonda itself.

What you need:

  • A Kubernetes cluster (local or remote)
  • kubectl and helm CLI tools installed
  • curl and jq for API calls
  • Familiarity with Prometheus scraping and Grafana dashboards

Set up a local Kubernetes cluster

If you already have a cluster (EKS, GKE, AKS, or an existing local one), skip to Deploy sonda-server.

For local development and testing, you need a lightweight Kubernetes distribution that runs on your workstation. Here are the most practical options:

Tool      Best for                                      Runs on
--------  --------------------------------------------  ----------------------------
kind      CI pipelines, fast throwaway clusters         Linux, macOS, Windows (WSL2)
k3d       k3s in Docker, built-in registry support      Linux, macOS, Windows (WSL2)
minikube  Broad driver support, add-on ecosystem        Linux, macOS, Windows (WSL2)
OrbStack  Native macOS experience, low resource usage   macOS only

All four require Docker (or a compatible container runtime) installed and running.

kind runs Kubernetes nodes as Docker containers. It starts in under 30 seconds and is the lightest option.

# Install (macOS/Linux)
brew install kind

# Or download the binary directly
# https://kind.sigs.k8s.io/docs/user/quick-start/#installation

# Create a cluster
kind create cluster --name sonda-lab

# Verify
kubectl cluster-info --context kind-sonda-lab

Port mapping for kind

kind clusters don't expose container ports to the host by default. If you need NodePort access (for Prometheus or Grafana outside the cluster), create the cluster with a config:

kind-config.yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
    extraPortMappings:
      - containerPort: 30080
        hostPort: 30080
        protocol: TCP

kind create cluster --name sonda-lab --config kind-config.yaml

k3d wraps k3s (Rancher's lightweight Kubernetes) inside Docker. It supports built-in port mapping and a local image registry out of the box.

# Install (macOS/Linux)
brew install k3d

# Create a cluster with port mapping
k3d cluster create sonda-lab -p "8080:80@loadbalancer"

# Verify
kubectl cluster-info

minikube is the most established option. It supports Docker, HyperKit, Hyper-V, and other drivers.

# Install (macOS/Linux)
brew install minikube

# Start with Docker driver (recommended)
minikube start --driver=docker --profile sonda-lab

# Verify
kubectl cluster-info --context sonda-lab

Windows WSL2

On Windows, install minikube inside your WSL2 distribution and use the Docker driver. Make sure Docker Desktop's WSL2 backend is enabled. The same commands apply inside the WSL2 terminal.

OrbStack provides a native macOS Kubernetes experience with minimal resource usage. It runs a lightweight single-node cluster alongside its Linux machines and containers.

# Install
brew install orbstack

# Enable Kubernetes in OrbStack (Settings > Kubernetes), then verify
kubectl cluster-info

Once your cluster is running and kubectl get nodes shows a Ready node, you're set.


Deploy sonda-server

Sonda includes a Helm chart that deploys sonda-server as a Kubernetes Deployment with health probes, a ClusterIP Service, and optional scenario injection via ConfigMap.

helm install sonda ./helm/sonda

Wait for the pod to become ready:

kubectl get pods -l app.kubernetes.io/name=sonda -w

You should see 1/1 Running within 15--20 seconds. The Deployment configures liveness and readiness probes against GET /health, so Kubernetes restarts the pod automatically if the server becomes unresponsive.
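For reference, the probe wiring is equivalent to something like the following Deployment fragment (a sketch; the exact timings are defined in the chart, and the values shown here are assumptions):

```yaml
# Sketch of the probe fields the chart renders (timing values assumed)
livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 3
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /health
    port: 8080
  periodSeconds: 5
```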

Customizing the deployment

Override common settings with --set:

# Pin a specific image version
helm install sonda ./helm/sonda --set image.tag=0.4.0

# Custom port and resource limits
helm install sonda ./helm/sonda \
  --set server.port=9090 \
  --set resources.requests.cpu=200m \
  --set resources.limits.memory=512Mi

See Kubernetes deployment for the full chart reference.

Verify the server is healthy by port-forwarding to it:

kubectl port-forward svc/sonda 8080:8080 &
curl http://localhost:8080/health
# {"status":"ok"}

Now let's submit some long-running scenarios.


Submit long-running scenarios

A long-running scenario is simply a scenario YAML without a duration field. It runs indefinitely until you stop it with DELETE /scenarios/{id}.

examples/long-running-metrics.yaml
name: continuous_cpu
rate: 10
generator:
  type: sine
  amplitude: 50.0
  period_secs: 60
  offset: 50.0
labels:
  instance: api-server-01
  job: sonda
encoder:
  type: prometheus_text
sink:
  type: stdout

Submit it to the server:

ID=$(curl -s -X POST -H "Content-Type: text/yaml" \
  --data-binary @examples/long-running-metrics.yaml \
  http://localhost:8080/scenarios | jq -r '.id')

echo "Scenario started: $ID"

The scenario runs in a background thread inside the server. Submit as many as you need -- each gets its own thread and scrape endpoint.

Multiple scenarios for richer coverage

Submit several scenarios with different shapes to simulate a realistic environment: a sine wave for CPU, a step counter for requests, a constant for an up gauge. Each scenario gets its own /scenarios/{id}/metrics endpoint that Prometheus can scrape independently.
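As one illustration, a constant "up"-style scenario might look like the following sketch, mirroring the schema of the example above. The constant generator type and its value field are assumptions (only sine is shown in this guide); check the generator reference for the actual names.

```yaml
# examples/constant-up.yaml (sketch; generator type and fields assumed)
name: synthetic_up
rate: 1
generator:
  type: constant
  value: 1.0
labels:
  instance: api-server-01
  job: sonda
encoder:
  type: prometheus_text
sink:
  type: stdout
```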

To verify it's running:

# List all running scenarios
curl -s http://localhost:8080/scenarios | jq '.[] | {id, name, status}'

# Check live stats for your scenario
curl -s http://localhost:8080/scenarios/$ID/stats | jq .

For the full API reference, see Server API.


Scrape metrics with Prometheus

Each running scenario exposes its metrics at GET /scenarios/{id}/metrics in Prometheus text exposition format. You can point Prometheus (or any compatible scraper like vmagent) at this endpoint.

Static scrape config

If you know the scenario ID ahead of time, configure a static scrape job:

prometheus-scrape.yaml
scrape_configs:
  - job_name: sonda
    scrape_interval: 15s
    metrics_path: /scenarios/<SCENARIO_ID>/metrics
    static_configs:
      - targets: ["sonda.default.svc:8080"]

Replace <SCENARIO_ID> with the UUID returned by POST /scenarios. The target address uses the Kubernetes Service DNS name (sonda.<namespace>.svc).
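If you'd rather not paste IDs by hand, a small script can render one scrape job per running scenario. This is a sketch: it assumes GET /scenarios returns objects with an id field, as shown earlier.

```shell
#!/bin/bash
# Sketch: emit a static scrape job for every running scenario.
SONDA_URL="${SONDA_URL:-http://localhost:8080}"

echo "scrape_configs:"
curl -s "$SONDA_URL/scenarios" | jq -r '.[] |
  "  - job_name: sonda-\(.id)\n" +
  "    scrape_interval: 15s\n" +
  "    metrics_path: /scenarios/\(.id)/metrics\n" +
  "    static_configs:\n" +
  "      - targets: [\"sonda.default.svc:8080\"]"'
```

Re-run it (and reload Prometheus) whenever you rotate scenarios, since new scenarios get new UUIDs.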

Prometheus ServiceMonitor

If you run the Prometheus Operator (kube-prometheus-stack), you can create a ServiceMonitor to auto-discover sonda-server. The Sonda Helm chart does not include a ServiceMonitor template today, so create one manually:

sonda-servicemonitor.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: sonda
  labels:
    release: prometheus  # must match your Prometheus Operator's selector
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: sonda
  endpoints:
    - port: http
      interval: 15s
      path: /scenarios/<SCENARIO_ID>/metrics

kubectl apply -f sonda-servicemonitor.yaml

One path per ServiceMonitor endpoint

Each ServiceMonitor endpoint scrapes a single path. If you have multiple running scenarios, you need one endpoints entry per scenario ID (each with a different path). For dynamic discovery, consider using a relabeling rule or a script that queries GET /scenarios and updates the scrape config.
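For two scenarios, that means two entries differing only in path (the IDs are placeholders):

```yaml
  endpoints:
    - port: http
      interval: 15s
      path: /scenarios/<SCENARIO_ID_1>/metrics
    - port: http
      interval: 15s
      path: /scenarios/<SCENARIO_ID_2>/metrics
```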

Using vmagent instead of Prometheus

vmagent supports the same scrape_configs format. Point it at sonda-server using a standard static scrape config. If you're already running the VictoriaMetrics Docker Compose stack, add sonda-server as a scrape target in the vmagent config.


Build Grafana dashboards

Once Prometheus is scraping your synthetic metrics, you can visualize them in Grafana.

Sonda ships with a Sonda Overview dashboard (docker/grafana/dashboards/sonda-overview.json) that shows metric values, event rates, and gap/burst indicators. You can import it directly into any Grafana instance connected to a Prometheus-compatible datasource.

Import the shipped dashboard

  1. Open Grafana and go to Dashboards > Import.
  2. Upload docker/grafana/dashboards/sonda-overview.json or paste its contents.
  3. Select your Prometheus datasource when prompted.
  4. The dashboard uses template variables $datasource and $job -- set $job to sonda (or whatever job label your scenarios use).

Build a custom panel

For a focused monitoring panel, create a new dashboard with a time series visualization and query your synthetic metric directly:

continuous_cpu{job="sonda", instance="api-server-01"}

Add a second panel showing how quickly the value is changing. Because continuous_cpu is a gauge, use deriv() rather than rate() (which is only meaningful for counters):

deriv(continuous_cpu{job="sonda"}[1m])

Threshold lines

Add a fixed threshold line in the Grafana panel options (e.g., at 90 for a CPU alert threshold). This gives you a visual reference for when the sine wave crosses your alert boundary.

With dashboards in place, you can see your synthetic data flowing at a glance. Next, let's make sure Sonda itself stays healthy.


Monitor sonda-server health

The stats API tells you whether each scenario is emitting as expected. Poll it periodically or build monitoring around it.

Health endpoint

The simplest check -- Kubernetes already uses this for liveness and readiness probes:

curl http://localhost:8080/health
# {"status":"ok"}

Per-scenario stats

The /scenarios/{id}/stats endpoint returns live stats including event counts, current emission rate, bytes emitted, error counts, and gap/burst state:

curl -s http://localhost:8080/scenarios/$ID/stats | jq .

Key fields to watch:

Field         What it tells you
------------  ------------------------------------------------------------
total_events  Running count of emitted events. Should increase steadily.
current_rate  Actual emission rate. Compare against your scenario's rate.
errors        Error count. Should be 0 for healthy scenarios.
uptime        Time since scenario started. Confirms it hasn't restarted.

List all scenarios

Check that all your submitted scenarios are still running:

curl -s http://localhost:8080/scenarios | jq '.[] | {name, status}'

If a scenario shows status: "stopped" unexpectedly, re-submit it.
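A quick way to spot those, assuming the list response includes a status field as shown above:

```shell
# Print the IDs of scenarios that are no longer running
curl -s http://localhost:8080/scenarios | jq -r '.[] | select(.status == "stopped") | .id'
```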

Scripting a health check

Wrap the stats check in a simple script that alerts you if events stop flowing:

check-sonda.sh
#!/bin/bash
SONDA_URL="http://localhost:8080"

for id in $(curl -s "$SONDA_URL/scenarios" | jq -r '.[].id'); do
  name=$(curl -s "$SONDA_URL/scenarios/$id" | jq -r '.name')
  before=$(curl -s "$SONDA_URL/scenarios/$id/stats" | jq '.total_events')
  sleep 10  # long enough for new events at your configured rate
  after=$(curl -s "$SONDA_URL/scenarios/$id/stats" | jq '.total_events')
  if [ "$after" -le "$before" ]; then
    echo "ALERT: $name ($id) stalled at $after events" >&2
  else
    echo "$name ($id): $after events and counting"
  fi
done

Rotate scenarios

Test patterns change over time. You might start with a sine wave to validate dashboards, then switch to a sequence generator to test alert thresholds. Scenario rotation is straightforward: stop the old scenario and start a new one.

Stop and replace

# Stop the running scenario
curl -s -X DELETE http://localhost:8080/scenarios/$ID | jq .
# {"id":"...","status":"stopped","total_events":12345}

# Submit a new scenario
NEW_ID=$(curl -s -X POST -H "Content-Type: text/yaml" \
  --data-binary @examples/sequence-alert-test.yaml \
  http://localhost:8080/scenarios | jq -r '.id')

echo "New scenario: $NEW_ID"

Scrape config update required

When you replace a scenario, the new scenario gets a different UUID. If your Prometheus scrape config uses the scenario ID in the metrics_path, you need to update it to point at the new ID.

Scripted rotation

For scheduled rotations (e.g., different patterns during business hours vs. overnight), wrap the stop-and-start sequence in a cron job or Kubernetes CronJob:

rotate-scenario.sh
#!/bin/bash
SONDA_URL="http://localhost:8080"
SCENARIO_FILE="$1"

# Stop all running scenarios
for id in $(curl -s "$SONDA_URL/scenarios" | jq -r '.[].id'); do
  curl -s -X DELETE "$SONDA_URL/scenarios/$id" > /dev/null
done

# Start the new scenario
curl -s -X POST -H "Content-Type: text/yaml" \
  --data-binary "@$SCENARIO_FILE" \
  "$SONDA_URL/scenarios" | jq .

# Rotate to a new pattern
./rotate-scenario.sh examples/long-running-metrics.yaml
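As a sketch, a Kubernetes CronJob that runs the script at 08:00 and 20:00 might look like this. The image and ConfigMap names are assumptions for illustration; any image that ships curl and jq works, and inside the cluster the script's SONDA_URL should point at the Service DNS name (http://sonda.default.svc:8080) rather than localhost.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: sonda-rotate
spec:
  schedule: "0 8,20 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: rotate
              image: my-curl-jq:latest   # assumption: any image with curl and jq
              command: ["/bin/sh", "/scripts/rotate-scenario.sh", "/scenarios/next.yaml"]
              volumeMounts:
                - name: scripts
                  mountPath: /scripts
                - name: scenarios
                  mountPath: /scenarios
          volumes:
            - name: scripts
              configMap:
                name: sonda-rotate-scripts   # assumed ConfigMap holding the script
            - name: scenarios
              configMap:
                name: sonda-scenarios        # assumed ConfigMap holding scenario YAML
```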

Alert on Sonda itself

Synthetic monitoring is only useful if you know when it breaks. If Sonda stops emitting, your dashboards go silent, and you need to distinguish "Sonda died" from "real outage."

Detect missing synthetic data

Create an alert rule that fires when your synthetic metric disappears. This uses the absent() function in PromQL:

sonda-watchdog-rules.yaml
groups:
  - name: sonda-watchdog
    interval: 30s
    rules:
      - alert: SondaSyntheticDataMissing
        expr: absent(continuous_cpu{job="sonda"})
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "Synthetic monitoring data missing"
          description: >
            The metric continuous_cpu from Sonda has not been seen for 2 minutes.
            Either sonda-server is down or the scenario has stopped.

This fires if continuous_cpu{job="sonda"} hasn't been scraped for 2 minutes. Adjust the for: duration based on your scrape interval and tolerance for gaps.
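You can also validate the rule offline with promtool's unit-test format. A minimal test file might look like the following sketch (file names assumed); with no input series at all, the absent() expression returns a result and the alert should be firing by the 5-minute mark:

```yaml
# sonda-watchdog-test.yaml -- run with: promtool test rules sonda-watchdog-test.yaml
rule_files:
  - sonda-watchdog-rules.yaml
evaluation_interval: 30s
tests:
  - interval: 30s
    input_series: []   # no continuous_cpu samples at all
    alert_rule_test:
      - eval_time: 5m
        alertname: SondaSyntheticDataMissing
        exp_alerts:
          - exp_labels:
              severity: warning
              job: sonda
```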

Monitor the pod itself

Since sonda-server runs as a Kubernetes Deployment with health probes, standard kube-state-metrics alerts cover pod-level failures:

- alert: SondaPodNotReady
  expr: kube_pod_status_ready{pod=~"sonda.*", condition="true"} == 0
  for: 5m
  labels:
    severity: critical
  annotations:
    summary: "Sonda pod is not ready"

Layer your alerting

A robust setup uses both layers:

Layer            What it catches                                               Alert
---------------  ------------------------------------------------------------  -------------------------
Pod health       Server crash, OOM kill, image pull failure                    SondaPodNotReady
Metric presence  Scenario stopped, scrape misconfigured, data pipeline broken  SondaSyntheticDataMissing

The pod alert fires fast (infrastructure issue). The metric-absent alert fires when the data pipeline is broken anywhere between Sonda and Prometheus -- which is exactly the kind of problem synthetic monitoring exists to catch.

Testing these alerts with Sonda

You can validate these watchdog rules using the same patterns from the Alert Testing and Alerting Pipeline guides. Submit a scenario, verify the alert stays silent, then DELETE the scenario and watch the absent() alert fire.


Quick reference

Task                    Command
----------------------  ----------------------------------------------------------------------------------------------
Deploy sonda-server     helm install sonda ./helm/sonda
Submit a scenario       curl -X POST -H "Content-Type: text/yaml" --data-binary @scenario.yaml http://localhost:8080/scenarios
List running scenarios  curl http://localhost:8080/scenarios
Check scenario stats    curl http://localhost:8080/scenarios/<id>/stats
Scrape metrics          curl http://localhost:8080/scenarios/<id>/metrics
Stop a scenario         curl -X DELETE http://localhost:8080/scenarios/<id>
Health check            curl http://localhost:8080/health
