Sink Batching

When you run a Sonda scenario, metrics often appear in chunks on stdout or arrive at VictoriaMetrics in bursts. That is batching at work, and it is intentional.

What batching is

A sink is where generated telemetry is delivered — your terminal, a file, or a backend like Loki or VictoriaMetrics. Some network sinks (loki, http_push, remote_write, otlp_grpc, kafka) are batching sinks. Instead of one network call per event, they collect events in a buffer and send them together. One large request is cheaper than a thousand small ones; that is the efficiency win.

The moment a batching sink empties its buffer and sends the events is called a flush. Every batching sink answers one question: when do I flush?

When a sink flushes

A batching sink flushes when any of these triggers fires, whichever comes first:

  1. SIZE      -- buffer fills to batch_size
  2. TIME      -- buffer is older than max_buffer_age
  3. SHUTDOWN  -- the scenario ends

batch_size is the size threshold and max_buffer_age is the time threshold; the rest of this page covers both. The short version: a batch never grows larger than batch_size and, as long as events keep arriving, does not sit much longer than max_buffer_age.
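
The three triggers can be read as a single decision function. The following is a minimal Python sketch, not Sonda's implementation: the name flush_reason and its parameters are hypothetical, and it assumes thresholds are compared with >= and that a max_buffer_age of zero disables the time trigger (the opt-out described later on this page).

```python
from typing import Optional


def flush_reason(
    buffered: int,
    batch_size: int,
    age_seconds: float,
    max_buffer_age_seconds: float,
    shutting_down: bool = False,
) -> Optional[str]:
    """Return which trigger fires for a buffer, or None if it keeps filling.

    Hypothetical model: `buffered` and `batch_size` share a unit (bytes or
    entries, depending on the sink); a max age of 0 turns the time trigger off.
    """
    if buffered == 0:
        return None  # nothing to send, even at shutdown
    if buffered >= batch_size:
        return "size"
    if max_buffer_age_seconds > 0 and age_seconds >= max_buffer_age_seconds:
        return "time"
    if shutting_down:
        return "shutdown"
    return None


# A partial batch that has waited 6s flushes on TIME with a 5s max age.
print(flush_reason(buffered=2, batch_size=5, age_seconds=6.0,
                   max_buffer_age_seconds=5.0))  # time
```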

Why batching exists

Sending each metric event on its own would mean one syscall (for stdout, file, and tcp) or one network request (for sinks such as http_push or loki) per event. At high rates, that overhead dominates. Batching collects events in a buffer and sends them together, trading a small delay for much higher throughput.

How each sink batches

Sonda uses two kinds of batching depending on the sink type:

| Sink | Batching | Size threshold | Time threshold | Unit |
|---|---|---|---|---|
| stdout | OS-level (BufWriter) | ~8 KB (fixed) | -- | bytes |
| file | OS-level (BufWriter) | ~8 KB (fixed) | -- | bytes |
| tcp | OS-level (BufWriter) | ~8 KB (fixed) | -- | bytes |
| udp | None (immediate) | -- | -- | -- |
| http_push | Application-level | 4 KiB (configurable) | 5s (configurable) | bytes |
| kafka | Application-level | 64 KiB (fixed) | 5s (configurable) | bytes |
| loki | Application-level | 5 entries (configurable) | 5s (configurable) | entries |
| remote_write | Application-level | 5 entries (configurable) | 5s (configurable) | entries |
| otlp_grpc | Application-level | 5 entries (configurable) | 5s (configurable) | entries |

OS-level buffering (stdout, file, tcp)

These sinks wrap their output in Rust's BufWriter with a default ~8 KB buffer. Encoded metric lines accumulate in memory and are written to the destination when the buffer fills or when Sonda flushes explicitly at scenario end.

That is why stdout output appears in bursts. The terminal receives a chunk of lines each time the buffer flushes, not one line per event.
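
The same effect can be reproduced outside Sonda. Below is a small Python sketch that uses io.BufferedWriter as a stand-in for Rust's BufWriter (an analogy, not Sonda's code). The CountingRaw class is a hypothetical helper that records each write that actually reaches the destination; the 8192-byte buffer and the 100-byte lines are illustrative values.

```python
import io


class CountingRaw(io.RawIOBase):
    """Hypothetical destination that records every chunk it actually receives."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def writable(self):
        return True

    def write(self, data):
        self.chunks.append(bytes(data))
        return len(data)


raw = CountingRaw()
out = io.BufferedWriter(raw, buffer_size=8192)
line = b"x" * 99 + b"\n"  # one 100-byte encoded event

for _ in range(50):       # 5,000 bytes: still below the buffer size
    out.write(line)
writes_before_fill = len(raw.chunks)

for _ in range(50):       # 10,000 bytes total: the buffer has overflowed
    out.write(line)
writes_after_fill = len(raw.chunks)

out.flush()               # explicit flush, as at scenario end
total_bytes = sum(len(chunk) for chunk in raw.chunks)
print(writes_before_fill, writes_after_fill >= 1, total_bytes)  # 0 True 10000
```

Nothing reaches the destination during the first 50 writes; the data only moves once the buffer overflows or is flushed explicitly.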

Application-level batching (http_push, kafka, loki, remote_write, otlp_grpc)

These sinks manage their own internal buffer. Each call to write() appends data. When the buffer reaches the configured threshold, the entire batch is sent as a single HTTP POST, Kafka record, or gRPC call.

Data does not appear at the destination until one of these happens:

  1. The batch fills and triggers a size-based flush.
  2. A non-empty batch ages past its time threshold and triggers a time-based flush.
  3. The scenario completes and Sonda flushes the remaining partial batch.

No batching (udp)

The UDP sink sends each encoded event as a single datagram immediately. There is no buffering.

Configuring batch size

Four sinks let you tune the batch threshold through the batch_size field in the sink config.

http_push: batch_size is in bytes. Default: 4096 (4 KiB).

Larger batches for high-rate scenarios
sink:
  type: http_push
  url: "http://localhost:8428/api/v1/import/prometheus"
  content_type: "text/plain"
  batch_size: 65536  # 64 KiB -- fewer requests at thousands of events/s

remote_write: batch_size is in TimeSeries entries. Default: 5.

Larger remote write batches for high-rate scenarios
encoder:
  type: remote_write
sink:
  type: remote_write
  url: "http://localhost:8428/api/v1/write"
  batch_size: 100  # fewer requests at thousands of events/s

loki: batch_size is in log entries. Default: 5.

Larger Loki batches for high-rate scenarios
sink:
  type: loki
  url: "http://localhost:3100"
  batch_size: 100  # fewer requests at thousands of events/s

otlp_grpc: batch_size is in data points / log records. Default: 5.

Larger OTLP batches for high-rate scenarios
encoder:
  type: otlp
sink:
  type: otlp_grpc
  endpoint: "http://localhost:4317"
  signal_type: metrics
  batch_size: 100  # fewer requests at thousands of events/s

Choosing a batch size

Smaller batches mean data appears at the destination sooner, but each batch carries HTTP or network overhead. For debugging and development, use small batches (e.g., batch_size: 1 for http_push) to see data arrive immediately. For load testing, keep the defaults or increase them to reduce request volume.

Time-based flushing

batch_size alone misses one case: a low-rate scenario. If you generate one log line every 20 seconds and batch_size is 5 entries, the buffer takes over a minute and a half to fill, and nothing reaches the backend until it does. To anyone watching Loki or VictoriaMetrics, the pipeline looks broken.

Here is that gap as a timeline. The scenario produces one event every ~4 seconds with batch_size: 5:

SIZE-ONLY FLUSHING -- buffer flushes only when FULL or at shutdown

 events arrive:   e.....e.....e.....e.....e.....e.....e.....
 buffer count:    1     2     3     4     5 -FLUSH          ...
 backend sees:    ........................[5 events arrive]
                  '------ ~20s of total silence ------'
                       "is my pipeline broken?"

max_buffer_age closes that gap. It is a time threshold that complements the size threshold: a non-empty batch is flushed once its contents have been buffered longer than max_buffer_age, in addition to the size-triggered and shutdown flushes. Whichever threshold trips first wins, so a batch never grows larger than batch_size and, while events keep arriving, does not sit much longer than max_buffer_age.

Same scenario, same low event rate, now with max_buffer_age: 5s:

WITH max_buffer_age -- buffer also flushes when its contents get older than max_buffer_age

 events arrive:   e.....e.....e.....e.....e.....e.....e.....
 buffer:          1  2 |     1  2 |     1  2 |     1 ...
                       'FLUSH     'FLUSH     'FLUSH
                     (age>5s)   (age>5s)   (age>5s)
 backend sees:    .....#......#......#......#......#
                  '-- data every ~5s, not every ~20s --'

The batch never fills to batch_size at this rate, so the size trigger never fires. The time trigger does. Low-rate scenarios deliver promptly instead of buffering invisibly.

max_buffer_age is supported by all five application-level sinks: http_push, loki, remote_write, otlp_grpc, and kafka. It accepts a duration string ("5s", "500ms", "2m") and defaults to 5s when omitted, so low-rate scenarios get prompt first delivery with no configuration.

Low-rate scenario with explicit time threshold
version: 2
kind: runnable

defaults:
  rate: 0.05  # one event every 20 seconds
  encoder:
    type: json_lines

scenarios:
  - signal_type: logs
    name: slow_audit_logs
    log_generator:
      type: template
      templates:
        - message: "user {user} performed {action}"
          field_pools:
            user: ["alice", "bob"]
            action: ["login", "logout"]
    sink:
      type: loki
      url: "http://localhost:3100"
      batch_size: 100        # size threshold -- rarely reached at this rate
      max_buffer_age: "30s"  # time threshold -- flush a partial batch once it is older than 30s
    labels:
      job: sonda
      env: dev

Disabling time-based flushing

Set max_buffer_age: "0s" to turn time-based flushing off. The sink reverts to size-and-shutdown-only flushing, which is the behavior of batch_size by itself. This is the opt-out for high-rate streams that fill a batch in well under five seconds and do not need the extra flush path.

Disable time-based flushing for a high-rate stream
sink:
  type: http_push
  url: "http://localhost:8428/api/v1/import/prometheus"
  content_type: "text/plain"
  batch_size: 65536
  max_buffer_age: "0s"  # size-and-shutdown flushing only

The age is checked on write

max_buffer_age is evaluated each time an event is written to the sink. If a sink stops receiving writes (for example, during a long scenario gap), a partially full batch will not flush until the next write arrives or the scenario stops. This is expected behavior: the timer is driven by writes, not by a background clock.
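
To see the consequence, here is a small Python simulation of a write-driven age check. It is a sketch under assumptions, not Sonda's code: the function simulate_flushes is hypothetical, and it assumes the sink checks the age of the oldest buffered event just before appending each new one. This page does not say whether the real check runs before or after the append.

```python
def simulate_flushes(event_times, batch_size, max_buffer_age, end_time):
    """Return (time, count, reason) for each flush in a write-driven model.

    event_times: ascending write timestamps in seconds.
    max_buffer_age: seconds; 0 disables the time trigger.
    """
    flushes = []
    buffer = []  # timestamps of buffered events, oldest first

    for now in event_times:
        # The age check only runs here, when a write arrives.
        if buffer and max_buffer_age > 0 and now - buffer[0] >= max_buffer_age:
            flushes.append((now, len(buffer), "time"))
            buffer = []
        buffer.append(now)
        if len(buffer) >= batch_size:
            flushes.append((now, len(buffer), "size"))
            buffer = []

    if buffer:  # shutdown flush of the remaining partial batch
        flushes.append((end_time, len(buffer), "shutdown"))
    return flushes


# Two quick writes, then a 60-second gap before the third.
result = simulate_flushes([0, 1, 61], batch_size=5, max_buffer_age=5, end_time=62)
print(result)  # [(61, 2, 'time'), (62, 1, 'shutdown')]
```

In this model the first two events wait 61 seconds even though max_buffer_age is 5 seconds, because no write arrives in between to run the check.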

Flush on exit

When a scenario completes, whether by reaching its configured duration, running out of events, or receiving SIGINT (Ctrl+C) or SIGTERM, Sonda always calls flush() on the sink. Any data in the buffer is sent, so you never lose a partial batch under normal circumstances.

SIGKILL bypasses flush

If you terminate Sonda with kill -9 (SIGKILL), the process is killed immediately with no chance to flush. Any data in the buffer is lost. Use Ctrl+C or kill (SIGTERM) instead for a clean shutdown.
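
The flush-on-exit guarantee follows a familiar pattern. As an illustration only (Python, not Sonda's Rust code; ListSink, run_scenario, and interrupted_events are hypothetical names), a try/finally block flushes both on normal completion and on Ctrl+C, which Python surfaces as KeyboardInterrupt. SIGKILL cannot be caught by any process, so a block like this would never get the chance to run.

```python
class ListSink:
    """Hypothetical batching sink that 'sends' by moving events to `delivered`."""

    def __init__(self):
        self.buffer = []
        self.delivered = []

    def write(self, event):
        self.buffer.append(event)

    def flush(self):
        self.delivered.extend(self.buffer)
        self.buffer.clear()


def run_scenario(sink, events):
    """Write events until done or interrupted; always flush before returning."""
    try:
        for event in events:
            sink.write(event)
    except KeyboardInterrupt:
        pass  # Ctrl+C: stop generating, fall through to the flush
    finally:
        sink.flush()


def interrupted_events():
    """Yield two events, then simulate Ctrl+C arriving mid-run."""
    yield "e1"
    yield "e2"
    raise KeyboardInterrupt


sink = ListSink()
run_scenario(sink, interrupted_events())
print(sink.delivered)  # ['e1', 'e2']
```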

Practical implications

Stdout appears chunky at low rates. If you run a scenario at 1 event per second, you may see no output for a minute or more while the ~8 KB buffer fills: at ~100 bytes per line, that is roughly 80 lines. This is normal. The data appears all at once when the buffer flushes or when the scenario ends.

Network sinks have delivery delay. With http_push at the default 4 KiB threshold, a scenario that produces small metrics (~100 bytes each) needs roughly 40 events to fill a batch. At 10 events per second, that is about 4 seconds before the first HTTP POST goes out. Raise batch_size for high-rate scenarios to reduce request volume.

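Short scenarios may send only one batch. If the whole run produces less data than the sink's size threshold and finishes before max_buffer_age elapses, the entire output is sent as a single flush at the end, not during execution. For example, 5 seconds at 10 events per second (50 events of ~100 bytes, about 5 KB in total) does not fill the ~8 KB stdout buffer, so everything is written once at shutdown.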
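
The delivery-delay arithmetic for http_push above generalizes to a quick estimate. This is a minimal Python sketch, assuming a steady event rate and a fixed per-event size; the function names are hypothetical, and the ~100-byte event size is the illustrative figure used above, not a measured value. Because the age is checked on writes, the time cap is approximate.

```python
import math


def events_to_fill(batch_size, unit_per_event=1):
    """Events needed before the size trigger fires.

    For byte-based sinks pass the approximate encoded event size in bytes;
    for entry-based sinks leave unit_per_event at 1.
    """
    return math.ceil(batch_size / unit_per_event)


def first_flush_delay(batch_size, rate, unit_per_event=1, max_buffer_age=5.0):
    """Approximate seconds until the first flush at a steady rate (events/s).

    The time trigger caps the wait unless max_buffer_age is 0 (disabled).
    """
    fill_seconds = events_to_fill(batch_size, unit_per_event) / rate
    if max_buffer_age > 0:
        return min(fill_seconds, max_buffer_age)
    return fill_seconds


# http_push defaults: 4096-byte threshold, ~100-byte events, 10 events/s.
print(events_to_fill(4096, 100))         # 41
print(first_flush_delay(4096, 10, 100))  # 4.1
```

With an entry-based sink at its default of 5 entries and a rate of 0.05 events per second, the same estimate shows the size trigger alone would take about 100 seconds, while the default 5s time threshold brings the first delivery forward.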

For details on configuring each sink, see Sinks.