filters/openpolicyagent: Add a sample benchmark to measure latency percentiles #3442

Draft · wants to merge 1 commit into base: master
Conversation

mefarazath (Collaborator)

  • Add benchmark for latency distribution with decision logging
  • Fix a missed benchmark test case for race conditions


Signed-off-by: Farasath Ahamed <[email protected]>
@mefarazath mefarazath marked this pull request as draft March 18, 2025 08:07
@@ -44,6 +44,8 @@ github.com/armon/go-metrics v0.4.1 h1:hR91U9KYmb6bLBYLQjyM+3j+rcd/UhE+G78SFnF8gJ
github.com/armon/go-metrics v0.4.1/go.mod h1:E6amYzXo6aW1tqzoZGT755KkbgrJsSdpwZ+3JqfkOG4=
github.com/aryszka/jobqueue v0.0.3 h1:O5YbgzQCjRomudwnDTY5BrHUNJhvPHQHq7GfGpE+ybs=
github.com/aryszka/jobqueue v0.0.3/go.mod h1:SdxqI6HZ4E1Lss94tey5OfjcAu3bdCDWS1AQzzIN4m4=
github.com/benburkert/pbench v0.0.0-20160623210926-4ec5821845ef h1:+7ZJvJGiV4hUBdjhEDhfGdjBCOmhVi0YQ5n+6g/ei+k=
@mefarazath (Collaborator, Author) Mar 18, 2025

@AlexanderYastrebov Based on the comments in this PR, I guess we are not okay with using this dependency.

Is it fine if we copy the function(s) we use from the library into our codebase?

@AlexanderYastrebov (Member) Mar 20, 2025

After reading the library's code I am not sure it does the right thing; for example, I don't see how it accounts for warmup benchmark runs. Let's first understand what we want to achieve.

The standard way to compare benchmark results is to use https://pkg.go.dev/golang.org/x/perf/cmd/benchstat

The table then compares the two input files for each benchmark. It shows the median and 95% confidence interval summaries for each benchmark before and after the change, and an A/B comparison under "vs base".

Checking help for benchstat:

$ benchstat
Usage: benchstat [flags] inputs...

benchstat computes statistical summaries and A/B comparisons of Go
benchmarks. It shows benchmark medians in a table with a row for each
benchmark and a column for each input file. If there is more than one
input file, it also shows A/B comparisons between the files. If a
difference is likely to be noise, it shows "~".

For details, see https://pkg.go.dev/golang.org/x/perf/cmd/benchstat.
  -alpha α
    	consider change significant if p < α (default 0.05)
  -col projection
    	split results into columns by distinct values of projection (default ".file")
  -confidence level
    	confidence level for ranges (default 0.95)
  -filter query
    	use only benchmarks matching benchfilter query (default "*")
  -format format
    	print results in format:
    	  text - plain text
    	  csv  - comma-separated values (warnings will be written to stderr)
    	 (default "text")
  -ignore keys
    	ignore variations in keys
  -row projection
    	split results into rows by distinct values of projection (default ".fullname")
  -table projection
    	split results into tables by distinct values of projection (default ".config")

It looks like we can change -confidence if we need a level other than the default 0.95.

Member:

Also if you want to compare benchmark results for two commits you may use this script golang/go#63233 (comment)

@mefarazath mefarazath added the minor no risk changes, for example new filters label Mar 18, 2025
@mefarazath mefarazath changed the title filters/openpolicyagent: Add a sample benchmark with to measure percentiles filters/openpolicyagent: Add a sample benchmark to measure latency percentiles Mar 18, 2025
@mefarazath (Collaborator, Author)

  • Sample benchmark run with percentiles
goos: darwin
goarch: arm64
pkg: github.com/zalando/skipper/filters/openpolicyagent/opaauthorizerequest
cpu: Apple M1 Max
BenchmarkWithPercentiles/minimal-with-decision-logs-10         	  127410	      8796 ns/op	   45762 B/op	     815 allocs/op
BenchmarkWithPercentiles/minimal-with-decision-logs/P50-10     	  127410	     59125 ns/op
BenchmarkWithPercentiles/minimal-with-decision-logs/P95-10     	  127410	    216125 ns/op
BenchmarkWithPercentiles/minimal-with-decision-logs/P99-10     	  127410	    529500 ns/op
BenchmarkWithPercentiles/minimal-with-decision-logs/P99.9-10   	  127410	   1463334 ns/op
PASS
  • Results with benchstat
~ > benchstat minimal-percentiles-dl-old
goos: darwin
goarch: arm64
pkg: github.com/zalando/skipper/filters/openpolicyagent/opaauthorizerequest
cpu: Apple M1 Max
                                                    │ minimal-percentiles-dl-old │
                                                    │           sec/op           │
WithPercentiles/minimal-with-decision-logs-10                        52.35µ ± 1%
WithPercentiles/minimal-with-decision-logs/P50-10                    112.0µ ± 2%
WithPercentiles/minimal-with-decision-logs/P95-10                    702.5µ ± 2%
WithPercentiles/minimal-with-decision-logs/P99-10                    15.47m ± 1%
WithPercentiles/minimal-with-decision-logs/P99.9-10                  16.14m ± 1%
geomean                                                              1.006m

                                              │ minimal-percentiles-dl-old │
                                              │            B/op            │
WithPercentiles/minimal-with-decision-logs-10                 68.50Ki ± 0%

                                              │ minimal-percentiles-dl-old │
                                              │         allocs/op          │
WithPercentiles/minimal-with-decision-logs-10                  1.084k ± 0%
  • Benchmark used to compare the performance of the new buffer implementation introduced by the OPA team in their PR
~ > benchstat minimal-decision-logs-percentiles-old minimal-decision-logs-percentiles-buffer-impl
goos: darwin
goarch: arm64
pkg: github.com/zalando/skipper/filters/openpolicyagent/opaauthorizerequest
cpu: Apple M1 Max
                                                    │ minimal-decision-logs-percentiles-old │ minimal-decision-logs-percentiles-buffer-impl │
                                                    │                sec/op                 │        sec/op          vs base                │
WithPercentiles/minimal-with-decision-logs-10                                  52.354µ ± 1%            8.887µ ±  6%  -83.03% (p=0.000 n=10)
WithPercentiles/minimal-with-decision-logs/P50-10                              112.02µ ± 2%            59.31µ ±  0%  -47.05% (p=0.000 n=10)
WithPercentiles/minimal-with-decision-logs/P95-10                               702.5µ ± 2%            220.5µ ± 14%  -68.61% (p=0.000 n=10)
WithPercentiles/minimal-with-decision-logs/P99-10                             15469.8µ ± 1%            543.8µ ± 12%  -96.48% (p=0.000 n=10)
WithPercentiles/minimal-with-decision-logs/P99.9-10                            16.143m ± 1%            1.499m ± 17%  -90.72% (p=0.000 n=10)
geomean                                                                         1.006m                 156.8µ        -84.41%

                                              │ minimal-decision-logs-percentiles-old │ minimal-decision-logs-percentiles-buffer-impl │
                                              │                 B/op                  │         B/op           vs base                │
WithPercentiles/minimal-with-decision-logs-10                            68.50Ki ± 0%            44.69Ki ± 0%  -34.76% (p=0.000 n=10)

                                              │ minimal-decision-logs-percentiles-old │ minimal-decision-logs-percentiles-buffer-impl │
                                              │               allocs/op               │       allocs/op        vs base                │
WithPercentiles/minimal-with-decision-logs-10                             1084.0 ± 0%              815.0 ± 0%  -24.82% (p=0.000 n=10)
