This repository hosts a portable benchmarking harness for the Tenzir data pipeline engine. It focuses on repeatable measurement of realistic pipelines and operators across Tenzir releases.
Fetch the reference datasets (currently Suricata EVE JSON and Zeek conn logs),
derive helper artifacts such as CSV and key-value views, and store everything in
the platform-specific cache directory (e.g. ~/.cache/tenzir-bench/datasets):
```sh
bench prepare
```
The command can be re-run; add --force to refresh downloads.
Execute the benchmark suite or an individual pipeline. The command discovers
.tql files under benchmarks/, stages datasets from the cached dataset
store, and records per-run JSON reports under the state directory (for example,
~/.local/state/tenzir-bench/results).
```sh
bench run
```
Use bench run path/to/pipeline.tql to target a specific scenario.
Download the reference results from the central location (S3). By default, the command downloads only the result data matching the current architecture, limited to results generated from the most recently published main branch and all published release versions of Tenzir. Pass the --full flag to sync all results.
Result artifacts are stored and synced using a deterministic layout derived from
the benchmark definition hash, input hash, runner, and Tenzir build identifier.
This keeps the local cache (~/.cache/tenzir-bench/results) aligned with the
remote bucket structure and prevents redundant downloads.
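As an illustration, a synced cache might look roughly like the following (the hash directory names and the report file name are placeholders; the runner also factors into the identifiers):
```
~/.cache/tenzir-bench/results/
└── <benchmark-hash>/
    └── <input-hash>/
        └── <build-id>/
            └── report.json   # per-run JSON report; file name assumed
```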
Metadata fetched from GitHub (release list and recent main commits) is cached
for 30 minutes; pass --refresh to bypass the TTL and force an update.
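For example (the subcommand name bench sync is taken from how it is referred to below):
```sh
# Default: only results for the current architecture, restricted to the
# latest published main branch and all published releases.
bench sync

# Mirror the complete result set.
bench sync --full

# Bypass the 30-minute GitHub metadata cache and force an update.
bench sync --refresh
```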
Compare the latest runs against a set of reference runs. By default, the benchmark results of the most recently published release version serve as the baseline, and the latest published results of the main branch are included for comparison. Reference runs are fetched from a central S3 bucket by default. The compact output mode summarizes wall-clock time and peak RSS of the fastest run per pipeline.
```sh
bench eval --runs ~/.local/state/tenzir-bench/results --compact
```
The optional --base option selects an explicit baseline instead of the automatic reference selection described below.
The full JSON report (without --compact) includes raw metrics, absolute
deltas, and percentage changes for every measured attribute.
When invoked without --base/--runs, bench eval automatically queries GitHub
for the most recent release tags and the latest commits on main, then selects
the corresponding artifacts that were previously synced via bench sync. It
uses the newest main-branch report available in the central location for the
current hardware. If no main results exist for this platform, bench eval
emits a warning and omits the main comparison. If neither release nor main
references are available, the command fails with an error message that suggests
running bench compare against locally built binaries instead. Use --strict
to make the command stop when references are missing or stale rather than
falling back to partial output.
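For example:
```sh
# Automatic mode: baseline from the most recent release, compared against the
# latest synced main results for the current hardware.
bench eval

# Stop instead of producing partial output when references are missing or stale.
bench eval --strict
```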
Note: GitHub API calls honor the GITHUB_TOKEN environment variable when present but also work unauthenticated (subject to rate limits).
Upload benchmark reports to the configured publication target (for example, an
object storage bucket). Publishing is idempotent: previously uploaded artifacts
are skipped unless --force is specified.
```sh
bench publish --runs ~/.local/state/tenzir-bench/results --destination s3://tenzir-benchmarks/main
```
Credentials, bucket names, retention policies, and other publishing details are managed via the built-in defaults; authentication relies on the standard AWS CLI configuration in the current environment.
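To re-upload artifacts that were already published, add --force, for example:
```sh
bench publish --runs ~/.local/state/tenzir-bench/results \
  --destination s3://tenzir-benchmarks/main --force
```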
Run the benchmarks for two Tenzir builds locally and compare the results directly. Cached results are reused when the binary, the benchmark TQL file, and the defined input files have not changed.
```sh
bench compare <path/to/baseline/bin/tenzir> <path/to/under-test/bin/tenzir> --compact
```
The compact output mode summarizes wall-clock time and peak RSS of the fastest run per pipeline in a single table.
You can also reference synced artifacts directly, e.g. bench compare --baseline latest-release --under-test main to diff the cached release and main results without locating binaries manually.
Each benchmark is a .tql file with YAML frontmatter followed by the pipeline
body. The harness injects BENCHMARK_INPUT_PATH and (when applicable)
BENCHMARK_OUTPUT_PATH so that pipelines can reference staged datasets without
hard-coded paths.
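Conceptually, each run sees environment values along these lines (the concrete paths below are illustrative, borrowed from the schema that follows):
```sh
# Illustrative values only; the harness sets these per run based on the
# benchmark's input/output configuration.
export BENCHMARK_INPUT_PATH="$HOME/.cache/tenzir-bench/datasets/suricata/eve.json"
export BENCHMARK_OUTPUT_PATH="tmp/eve-out.json"
```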
Frontmatter schema:
```yaml
---
benchmark:
  id: string                 # Globally unique identifier (required)
  description: string        # Short human-readable summary (optional)
  tags:                      # Arbitrary key/value metadata (optional)
    dataset: suricata-eve
    operator: read_json
  min_version: "5.17.0"      # Minimum Tenzir version allowed (optional)
  max_version: "6.0.0"       # Maximum Tenzir version allowed (optional)
  input:                     # Input dataset configuration (required)
    path: suricata/eve.json  # Relative to the managed dataset cache unless absolute
    events: 984865           # Optional record count for throughput stats
    measure: true            # Use input bytes for throughput (boolean)
  output:                    # Optional output measurement settings
    path: tmp/eve-out.json   # Relative to working directory
    measure: false           # Measure output bytes (input measure must be false)
  env:                       # Extra environment variables for the run (optional)
    TENZIR_CONSOLE_FORMAT: none
  tenzir_args:               # Extra CLI flags for the Tenzir binary (optional)
    - --verbosity
    - info
  runner: time               # Measurement runner (optional; defaults to 'time')
  runtime:                   # Execution policy (optional)
    warmup_runs: 1           # Warm-up iterations (default: 0)
    measurement_runs: 3      # Timed runs (default: 1)
    timeout_seconds: 600     # Per-run timeout (optional)
---
```
To add a new benchmark:
- Create benchmark/benchmarks/<category>/<id>.tql with the required frontmatter and pipeline.
- Run bench run path/to/<id>.tql to generate measurement reports.
- Validate results with bench eval.
Runners wrap the Tenzir invocation to collect metrics (e.g., /usr/bin/time for
wall-clock time, CPU time, and peak RSS; perf for hardware counters; cachegrind
for cache statistics).
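As a rough sketch of what these wrappers collect (the actual invocation assembled by the harness may differ; the wrapped command is a placeholder):
```sh
# GNU time reports wall-clock time, CPU time, and maximum resident set size.
/usr/bin/time -v <tenzir invocation for the staged pipeline>

# perf collects hardware counters for the same command.
perf stat -- <tenzir invocation for the staged pipeline>
```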
Maintaining baselines for all released Tenzir versions helps detect regressions when new changes land. A typical workflow:
- Enumerate the desired release binaries (for example, via Nix or container images).
- For each release, invoke bench run --tenzir-bin <path>. Reports are stored automatically in the state directory following the canonical <benchmark-hash>/<input-hash>/<build-id> layout.
- After all releases have been measured, publish the collected results with bench publish --runs ~/.local/state/tenzir-bench/results --destination … or archive the directory as needed.
Automating the loop is encouraged: a simple script can iterate over release
executables, run the suite, and finally call bench publish once per release or
for the entire batch. Downstream evaluators
can then diff a development build against any published baseline.
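A minimal sketch of such a loop, assuming the release binaries are already available under a local directory (the glob, results path, and destination bucket are placeholders to adapt):
```sh
#!/usr/bin/env sh
# Collect baselines for every available release binary, then publish the
# whole batch in one go.
RESULTS="$HOME/.local/state/tenzir-bench/results"
for tenzir_bin in /opt/tenzir-releases/*/bin/tenzir; do
  bench run --tenzir-bin "$tenzir_bin"
done
bench publish --runs "$RESULTS" --destination s3://tenzir-benchmarks/main
```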
Every published run should include metadata tying it back to the benchmark
definition revision (e.g., Git commit hash) so that bench eval and
bench compare can refuse to mix incompatible baselines.
This project is licensed under the Apache License, Version 2.0.