
Capacity Planning

You need to answer "how big does my infrastructure need to be?" -- but you can't test at scale against production, and guessing leads to either over-provisioning (wasted money) or under-provisioning (3 AM pages). Sonda lets you generate controlled, high-volume synthetic load against your observability backend so you can measure real ingestion limits, find cardinality ceilings, and size your infrastructure with data instead of hope.

What you need:

  • Sonda installed (Getting Started)
  • Docker with Compose v2 for backend testing (docker compose)
  • curl and jq for querying results

Start the test backend

All the scenarios in this guide push metrics to VictoriaMetrics. Start the included Docker Compose stack:

docker compose -f examples/docker-compose-victoriametrics.yml up -d

Wait for VictoriaMetrics to become healthy:

curl -s http://localhost:8428/health
# OK

Clean slate between tests

Tear down and recreate the stack between test runs to avoid leftover data skewing results: docker compose -f examples/docker-compose-victoriametrics.yml down -v && docker compose -f examples/docker-compose-victoriametrics.yml up -d


Test throughput limits

The first question in capacity planning: how many data points per second can your pipeline ingest before it falls behind? Sonda's rate field controls events per second per scenario, and multi-scenario files let you run several streams in parallel.

Smoke-check before scaling up

Before you push thousands of events per second at a backend, confirm Sonda generates at the target rate on your hardware. A 5-second stdout run is enough:

sonda -q metrics --name throughput_test --rate 1000 --duration 5s | wc -l
# ~5000 lines

If the line count roughly matches rate * duration, generation is keeping up. See Pipeline Validation for the full smoke-test pattern with exit-code checks and CI snippets. The rest of this section assumes the smoke check passed.
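
If you want this as a quick pass/fail gate (the full pattern with CI snippets lives in Pipeline Validation), a small wrapper is enough -- the 5% tolerance here is an arbitrary choice:

#!/usr/bin/env bash
# Smoke check: generated line count should be close to rate * duration.
RATE=1000
DURATION=5
EXPECTED=$((RATE * DURATION))

ACTUAL=$(sonda -q metrics --name throughput_test --rate "$RATE" --duration "${DURATION}s" | wc -l)

MIN=$((EXPECTED * 95 / 100))   # allow a 5% shortfall for startup/shutdown edges
if [ "$ACTUAL" -lt "$MIN" ]; then
  echo "generation fell behind: $ACTUAL of $EXPECTED expected lines" >&2
  exit 1
fi
echo "OK: $ACTUAL lines (target $EXPECTED)"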

Multi-stream throughput test

This scenario runs 3 metric streams at 1,000 events/sec each -- 3,000 data points per second hitting VictoriaMetrics:

sonda run --scenario examples/capacity-throughput-test.yaml
examples/capacity-throughput-test.yaml (excerpt)
version: 2

defaults:
  rate: 1000
  duration: 60s
  encoder:
    type: prometheus_text
  sink:
    type: http_push
    url: "http://localhost:8428/api/v1/import/prometheus"
    content_type: "text/plain"
  labels:
    job: capacity_test
    service: api-gateway
    env: load-test

scenarios:
  - signal_type: metrics
    name: throughput_http_requests
    generator:
      type: sine
      amplitude: 200.0
      period_secs: 30
      offset: 500.0
    jitter: 10.0
    jitter_seed: 1

  - signal_type: metrics
    name: throughput_cpu_usage
    # ... (sine generator, inherits rate/duration/sink from defaults)

  - signal_type: metrics
    name: throughput_memory_bytes
    # ... (sine generator, inherits rate/duration/sink from defaults)

The full file contains all three scenarios with different generator shapes. Each runs on its own thread.

After the run completes, verify all three metrics arrived:

curl -s "http://localhost:8428/api/v1/query?query=throughput_http_requests" | jq '.data.result | length'
curl -s "http://localhost:8428/api/v1/query?query=throughput_cpu_usage" | jq '.data.result | length'
curl -s "http://localhost:8428/api/v1/query?query=throughput_memory_bytes" | jq '.data.result | length'

Scale the test up

To find your saturation point, increase the rate until you see ingestion errors or data loss. Override rates from the CLI without editing the YAML:

# 5,000 events/sec on a single stream
sonda -q metrics --name throughput_test --rate 5000 --duration 30s \
  --value-mode sine --amplitude 50 --period-secs 60 --offset 100 \
  --label job=capacity_test --label env=load-test \
  --output /tmp/throughput-5k.txt

wc -l < /tmp/throughput-5k.txt
# ~150000 lines

Disk and network are usually the bottleneck

Sonda can generate tens of thousands of events per second on modest hardware. The limit is almost always on the receiving end -- network bandwidth, disk I/O on the TSDB, or HTTP connection limits. Monitor your backend's resource usage during the test, not just Sonda's output.
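
One low-effort way to watch the receiving side is to poll the backend's own metrics from a second terminal while the test runs. A rough sketch against the VictoriaMetrics endpoints used in this guide (sum() aggregates the insert counters across ingestion protocols; adjust the interval to taste):

# Poll VictoriaMetrics ingestion rate and memory every 5 seconds during a test
while true; do
  ROWS=$(curl -s "http://localhost:8428/api/v1/query?query=sum(rate(vm_rows_inserted_total[1m]))" \
    | jq -r '.data.result[0].value[1] // "0"')
  MEM=$(curl -s "http://localhost:8428/api/v1/query?query=process_resident_memory_bytes" \
    | jq -r '.data.result[0].value[1] // "0"')
  echo "$(date +%H:%M:%S) rows_per_sec=$ROWS rss_bytes=$MEM"
  sleep 5
done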


Find cardinality limits

High series cardinality (many unique label combinations) is the most common cause of TSDB performance degradation. Pod churn in Kubernetes, ephemeral containers, and misconfigured service discovery all create cardinality explosions. Sonda's cardinality_spikes feature lets you simulate these explosions in a controlled way.

How cardinality spikes work

A cardinality spike injects a dynamic label with many unique values during a recurring time window. Outside the window, the label is absent, so the metric stays a single series. During the window, each tick gets a different label value, producing as many unique series as the configured cardinality.

Time ──────────────────────────────────────────────────────►

  500 series │  1 series   │ 500 series  │  1 series   │
  (spike)    │ (baseline)  │ (spike)     │ (baseline)  │
─────────────┼─────────────┼─────────────┼─────────────┤
  0s        20s           60s           80s          120s
 ◄─ for:20s ─►             ◄── for:20s ──►
 ◄──────── every:60s ──────►

Run a cardinality stress test

This scenario pushes two metrics with overlapping cardinality spikes -- 500 unique pod names and 200 unique endpoints:

sonda run --scenario examples/capacity-cardinality-stress.yaml
examples/capacity-cardinality-stress.yaml (excerpt)
version: 2

defaults:
  rate: 50
  duration: 180s
  encoder:
    type: prometheus_text
  sink:
    type: http_push
    url: "http://localhost:8428/api/v1/import/prometheus"
    content_type: "text/plain"

scenarios:
  - signal_type: metrics
    name: http_requests_total
    generator:
      type: constant
      value: 1.0
    labels:
      job: capacity_test
      service: api-gateway
      env: load-test
    cardinality_spikes:
      - label: pod_name
        every: 60s
        for: 20s
        cardinality: 500
        strategy: counter
        prefix: "pod-"

  - signal_type: metrics
    name: http_request_duration_seconds
    # ... (sine generator, 200 endpoint cardinality spike)

After the run, measure how many unique series were created:

# Count unique series for http_requests_total
curl -s "http://localhost:8428/api/v1/series?match[]=http_requests_total" \
  | jq '.data | length'
# Expected: ~501 (1 baseline + 500 from spike)

Scale cardinality progressively

Run multiple tests with increasing cardinality to find where your TSDB starts struggling. Track query latency and memory usage at each level:

Test      Cardinality  Expected unique series  What to watch
Baseline  100          ~101                    Response time for count() queries
Medium    1,000        ~1,001                  TSDB memory usage, compaction time
High      5,000        ~5,001                  Query timeouts, ingestion backpressure
Extreme   10,000       ~10,001                 OOM events, disk I/O saturation
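
To keep the comparison honest across levels, capture the same numbers after every run. A sketch using the VictoriaMetrics endpoints from this guide -- the metric name matches the stress scenario, so change it to whatever you are generating:

# Record unique series, backend RSS, and count() query latency after a run
METRIC="http_requests_total"   # metric emitted by the cardinality scenario

SERIES=$(curl -s "http://localhost:8428/api/v1/series?match[]=${METRIC}" \
  | jq '.data | length')
RSS=$(curl -s "http://localhost:8428/api/v1/query?query=process_resident_memory_bytes" \
  | jq -r '.data.result[0].value[1]')
LATENCY=$(curl -s -o /dev/null -w '%{time_total}' \
  "http://localhost:8428/api/v1/query?query=count(${METRIC})")

echo "series=$SERIES rss_bytes=$RSS count_query_secs=$LATENCY"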

You can override the cardinality from the CLI for a quick single-metric test:

sonda -q metrics --name cardinality_test --rate 50 --duration 60s \
  --value 1 \
  --label job=capacity_test --label env=load-test \
  --spike-label pod_name --spike-every 60s --spike-for 30s \
  --spike-cardinality 5000 --spike-prefix "pod-"

Cardinality vs. throughput

Cardinality and throughput stress different parts of your TSDB. High throughput stresses the write path (WAL, network, CPU). High cardinality stresses the index (memory, compaction, query planning). Test both dimensions independently, then together.

Steady-state fleet simulation with dynamic labels

Cardinality spikes are time-windowed -- ideal for testing label explosions that come and go. If you want always-on cardinality (e.g., simulating a stable fleet of 50 hosts), use dynamic_labels instead. The label is present on every event, producing a constant number of series for the full duration. See Dynamic labels in the scenario field reference.


Simulate traffic spikes with bursts

Real traffic doesn't arrive at a steady rate. Black Friday, a viral post, or a cascading retry storm can spike your ingest rate by 10x in seconds. Sonda's bursts feature lets you simulate these spikes to test whether your pipeline handles sudden load without dropping data.

How bursts work

A burst window multiplies the base event rate for a short duration on a recurring schedule:

Rate ──►
         ┌──────┐                         ┌──────┐
  5000/s │      │                         │      │
         │      │                         │      │
         │      │                         │      │
   500/s ├──────┴─────────────────────────┴──────┴──────
         0s     5s                        30s    35s
         ◄for:5s►                         ◄for:5s►
         ◄─────────── every:30s ──────────►
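
As a sanity check on what a schedule like this implies for total volume, here is a back-of-the-envelope calculation using the numbers from the diagram and the 120-second run below (it assumes each burst fires at the start of its recurring window, as drawn):

# Expected event volume for: base 500/s, 10x bursts, every 30s, for 5s, 120s run
awk -v rate=500 -v mult=10 -v every=30 -v burst=5 -v dur=120 'BEGIN {
  windows  = int(dur / every)                    # 4 burst windows
  baseline = rate * dur                          # 60,000 events at the base rate
  extra    = windows * burst * rate * (mult - 1) # 90,000 extra events during bursts
  printf "expected events: %d\n", baseline + extra   # 150,000
}'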

Run a burst test

This scenario runs at 500 events/sec with 10x bursts (5,000/sec) every 30 seconds:

sonda run --scenario examples/capacity-burst-test.yaml
examples/capacity-burst-test.yaml (excerpt)
version: 2

defaults:
  duration: 120s
  encoder:
    type: prometheus_text
  sink:
    type: http_push
    url: "http://localhost:8428/api/v1/import/prometheus"
    content_type: "text/plain"

scenarios:
  - signal_type: metrics
    name: burst_http_requests
    rate: 500
    generator:
      type: sine
      amplitude: 100.0
      period_secs: 30
      offset: 300.0
    jitter: 5.0
    jitter_seed: 10
    labels:
      job: capacity_test
      service: api-gateway
      env: load-test
    bursts:
      every: 30s
      for: 5s
      multiplier: 10.0

  - signal_type: metrics
    name: burst_queue_depth
    rate: 100
    # ... (sine generator, same burst config)

After the run, verify that data arrived during both steady-state and burst windows:

curl -s "http://localhost:8428/api/v1/query_range?\
query=burst_http_requests&start=$(date -v-3M +%s)&end=$(date +%s)&step=5s" \
  | jq '.data.result[0].values | length'
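
The sample count alone does not show the burst shape. Counting samples per step does -- burst windows stand out as steps with roughly 10x the count (count_over_time is standard PromQL, which VictoriaMetrics supports):

# Samples ingested per 5-second step -- burst windows show ~10x the baseline count
curl -s "http://localhost:8428/api/v1/query_range?\
query=count_over_time(burst_http_requests[5s])&start=$(date -v-3M +%s)&end=$(date +%s)&step=5s" \
  | jq -r '.data.result[0].values[] | @tsv'
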
Combining bursts with cardinality spikes

For a worst-case scenario, combine both features in a single metric. This simulates a Kubernetes deployment rollout where pod churn (cardinality spike) and traffic ramp-up (burst) happen simultaneously:

name: worst_case_test
rate: 200
duration: 120s
generator:
  type: constant
  value: 1.0
labels:
  job: capacity_test
  env: load-test
bursts:
  every: 30s
  for: 10s
  multiplier: 5.0
cardinality_spikes:
  - label: pod_name
    every: 30s
    for: 10s
    cardinality: 1000
    strategy: counter
    prefix: "pod-"
encoder:
  type: prometheus_text
sink:
  type: stdout

During the 10-second overlap window, Sonda emits 1,000 events/sec (200 * 5) across 1,000 unique series. That's a sharp, realistic stress test.
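
Because the snippet uses the stdout sink, you can check the overlap without a backend at all. Assuming you have wrapped it into a complete scenario file (version, scenarios list, and so on, as in the earlier examples) saved as worst-case.yaml, and that the prometheus_text output carries labels inline:

# Run the worst-case scenario to stdout and inspect volume and cardinality
sonda run --scenario worst-case.yaml > /tmp/worst-case.txt

wc -l < /tmp/worst-case.txt                                        # total emitted lines (roughly)
grep -o 'pod_name="[^"]*"' /tmp/worst-case.txt | sort -u | wc -l   # ~1,000 unique pods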


Measure backend performance

Generating load is only half the story. You need to measure how your backend responds to that load. Here are the key metrics to capture during each test run.

VictoriaMetrics

Query these after your test completes:

# Ingestion rate (rows/sec received)
curl -s "http://localhost:8428/api/v1/query?\
query=rate(vm_rows_inserted_total[1m])" | jq '.data.result[0].value[1]'

# Active time series (cardinality)
curl -s "http://localhost:8428/api/v1/query?\
query=vm_cache_entries{type=\"storage/metricName\"}" | jq '.data.result[0].value[1]'

# Memory usage
curl -s "http://localhost:8428/api/v1/query?\
query=process_resident_memory_bytes" | jq '.data.result[0].value[1]'

Prometheus

If you're testing against Prometheus instead, use these queries:

# Head series count (active cardinality)
curl -s "http://localhost:9090/api/v1/query?\
query=prometheus_tsdb_head_series" | jq '.data.result[0].value[1]'

# Ingestion rate (samples/sec)
curl -s "http://localhost:9090/api/v1/query?\
query=rate(prometheus_tsdb_head_samples_appended_total[1m])" | jq '.data.result[0].value[1]'

# WAL corruption or truncation errors
curl -s "http://localhost:9090/api/v1/query?\
query=prometheus_tsdb_wal_truncations_failed_total" | jq '.data.result[0].value[1]'

What to record

Capture these metrics for each test run to build a sizing model:

Metric                        Where to get it                Why it matters
Ingestion rate (samples/sec)  TSDB internal metrics          Confirms actual vs. intended write rate
Active time series            TSDB head series count         Tracks cardinality growth
Memory usage                  process_resident_memory_bytes  RAM grows roughly linearly with cardinality
Disk growth rate              du -sh on TSDB data dir        Storage sizing for retention period
Query latency at load         Time a count() query           Detects index bloat from high cardinality
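
Most of these come straight from the queries above. Disk growth is the odd one out: the compose stack typically keeps the TSDB data in a Docker volume, so the easiest handle is VictoriaMetrics' own on-disk size metric (for other backends, fall back to du -sh on the data directory). Sample it before and after a run at a known rate to derive bytes per sample:

# On-disk TSDB size as reported by VictoriaMetrics (data + index parts)
curl -s "http://localhost:8428/api/v1/query?query=sum(vm_data_size_bytes)" \
  | jq -r '.data.result[0].value[1]'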

Calculate infrastructure sizing

With test data from the previous sections, you can build a practical sizing model. The formula is straightforward:

Storage estimate

daily_storage   = samples_per_second * 86400 * bytes_per_sample
monthly_storage = daily_storage * 30

Typical bytes_per_sample values (compressed, on disk):

Backend          Bytes per sample  Notes
VictoriaMetrics  0.4--1.0          Aggressive compression; lower end for uniform data
Prometheus       1.0--2.0          TSDB blocks with index overhead
Thanos / Mimir   1.5--3.0          Includes object storage overhead

Example calculation: You measured a sustained ingestion rate of 3,000 samples/sec during the throughput test.

daily_storage   = 3,000 * 86,400 * 1.0 bytes = ~259 MB/day  (VictoriaMetrics)
monthly_storage = 259 MB * 30 = ~7.8 GB/month

With 90-day retention: ~23 GB of disk.
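
To rerun the arithmetic for your own measured rate, bytes-per-sample, and retention, a throwaway calculation is enough (the values below are just the example numbers -- substitute your own):

# Storage projection from a measured ingestion rate
awk -v sps=3000 -v bps=1.0 -v retention_days=90 'BEGIN {
  daily = sps * 86400 * bps                      # bytes per day
  printf "daily:     %.1f MB/day\n", daily / 1e6
  printf "monthly:   %.1f GB/month\n", daily * 30 / 1e9
  printf "retention: %.1f GB over %d days\n", daily * retention_days / 1e9, retention_days
}'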

Memory estimate

Cardinality is the primary driver of memory usage. From your cardinality stress tests, note the memory delta between the baseline and each spike level:

# Measure memory before and after a cardinality test
BEFORE=$(curl -s "http://localhost:8428/api/v1/query?query=process_resident_memory_bytes" \
  | jq -r '.data.result[0].value[1] | tonumber | floor')

# Run the cardinality stress test
sonda run --scenario examples/capacity-cardinality-stress.yaml

AFTER=$(curl -s "http://localhost:8428/api/v1/query?query=process_resident_memory_bytes" \
  | jq -r '.data.result[0].value[1] | tonumber | floor')

# tonumber | floor keeps the value an integer so the bash arithmetic below
# doesn't choke on decimal or exponent-formatted samples
echo "Memory delta: $(( (AFTER - BEFORE) / 1024 / 1024 )) MB"

A rough rule of thumb: plan for 1--4 KB of RAM per active time series, depending on your backend and label complexity.
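
Applied to a target series count, that range brackets the RAM to plan for (500,000 active series is an arbitrary example):

# RAM bracket for a target active-series count at 1-4 KB per series
awk -v series=500000 'BEGIN {
  printf "low estimate:  %.1f GB\n", series * 1024 / 1e9
  printf "high estimate: %.1f GB\n", series * 4096 / 1e9
}'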

Sizing checklist

Run through this checklist for each environment you're sizing:

Dimension            Test scenario                                                Key measurement
Peak throughput      capacity-throughput-test.yaml at increasing rates            Max sustained samples/sec before errors
Cardinality ceiling  capacity-cardinality-stress.yaml at increasing cardinality   Max unique series before memory/query degradation
Burst headroom       capacity-burst-test.yaml with high multipliers               Whether data survives 10x spikes without loss
Disk budget          Any scenario at target rate for 10+ minutes                  Measured bytes/sample for storage projection

Performance baselines

These tables show measured performance from actual benchmarks. They represent Sonda-side resource usage -- generation rates, memory, and encoder overhead -- not backend capacity.

Measured on Apple Silicon

Benchmarks ran on macOS with Apple Silicon, Sonda v0.8.0 release build, file sink, single thread per scenario. Your numbers will vary by hardware and sink type, but the key takeaway holds: Sonda generates millions of events per second on a single core with a flat ~7.5 MB memory footprint. Use the throughput and cardinality tests above to measure your own environment.

Generator throughput

Single CPU core, prometheus_text encoder, file sink:

Generator   Events/sec           Notes
constant    ~7,200,000           Minimal computation per event
csv_replay  ~6,200,000           Values pre-loaded into memory, cycles through them
sawtooth    ~5,800,000           Simple linear ramp calculation
uniform     ~5,700,000           RNG evaluation per event
sine        ~5,400,000           Trigonometric calculation per event
histogram   ~257,000 ticks/sec   Each tick emits 14 lines (12 buckets + count + sum)
summary     ~349,000 ticks/sec   Each tick emits 6 lines (4 quantiles + count + sum)

Histogram and summary throughput

These generators are measured in ticks/sec because each tick produces multiple output lines. In terms of raw line throughput, histogram produces ~3.6M lines/sec and summary ~2.1M lines/sec.

Encoder bytes per event

Measured with 3 labels (job, instance, env), a typical metric name, and a float value:

Encoder          Bytes/event  Notes
prometheus_text  ~76          Human-readable, uncompressed
influx_lp        ~81          InfluxDB line protocol
otlp             ~95          Protobuf, written to file (no gRPC framing)
remote_write     ~97          Snappy-compressed protobuf
json_lines       ~136         JSON overhead from keys and quoting

Note

remote_write and otlp bytes measured via file sink. Over-the-wire size with network sinks may differ due to batching and compression. Actual size also varies with metric name length, number of labels, and value precision.

Memory footprint

Sonda does not store per-series state -- it generates events on the fly. Memory is essentially flat regardless of cardinality:

Scenario                                                RSS
Single metric, any cardinality (100 to 50,000 series)   ~7.5 MB
5 concurrent scenarios, 25,000 total events/sec         ~7.5 MB

Memory is driven by the number of concurrent scenarios and sink buffering, not by series cardinality.

Kubernetes resource recommendations

Sonda's low resource footprint means you can run it almost anywhere. Suggested resource requests for Sonda pods:

Profile  Event rate               CPU request  Memory request  Use case
Small    up to 1,000/sec          50m          32Mi            Development, CI pipeline checks
Medium   1,000--100,000/sec       100m         64Mi            Integration testing, alert validation
Large    100,000--1,000,000+/sec  250m         128Mi           Load testing, capacity planning

Set resource limits to roughly 2x the requests to leave headroom for bursts.


Quick reference

Task                          Command
Start backend                 docker compose -f examples/docker-compose-victoriametrics.yml up -d
Validate throughput scenario  sonda --dry-run run --scenario examples/capacity-throughput-test.yaml
Run throughput test           sonda run --scenario examples/capacity-throughput-test.yaml
Run cardinality stress test   sonda run --scenario examples/capacity-cardinality-stress.yaml
Run burst test                sonda run --scenario examples/capacity-burst-test.yaml
Count series in VM            curl -s "http://localhost:8428/api/v1/series?match[]=<metric>" | jq '.data | length'
Tear down                     docker compose -f examples/docker-compose-victoriametrics.yml down -v

Related pages: