Skip to content

Performance overhead of encoding/redacting #144

@PragTob

Description

@PragTob

👋

Not sure if this is a known issue but the performance overhead of the code in RedactorEncode is quite significant - even without any redactors as it traverses the entire data structure.

As in, I'd have expected the JSON encoding to be the most expensive formatting operation in the formatter - but in 2/3 scenarios the encoding & redacting (without any redactors) is ~twice as slow as the Jason encoding:

tobi@qiqi:~/github/elixir_playground(main)$ mix run bench/logger_json_overhead.exs 
Operating System: Linux
CPU Information: AMD Ryzen 9 5900X 12-Core Processor
Number of Available Cores: 24
Available memory: 31.26 GB
Elixir 1.17.3
Erlang 27.1
JIT enabled: true

Benchmark suite executing with the following configuration:
warmup: 100 ms
time: 1 s
memory time: 0 ns
reduction time: 0 ns
parallel: 1
inputs: just a msg, some map, bigger_map
Estimated total run time: 9 s 900 ms

# ...

##### With input just a msg #####
Name                          ips        average  deviation         median         99th %
just Jason                 2.74 M      365.26 ns ±10272.07%         260 ns         590 ns
logger_json encode         1.36 M      737.97 ns  ±3307.02%         630 ns        1080 ns
whole logger format        0.48 M     2082.71 ns   ±297.05%        1980 ns        3510 ns

Comparison: 
just Jason                 2.74 M
logger_json encode         1.36 M - 2.02x slower +372.71 ns
whole logger format        0.48 M - 5.70x slower +1717.45 ns

##### With input some map #####
Name                          ips        average  deviation         median         99th %
logger_json encode       766.24 K        1.31 μs   ±527.69%        1.21 μs        1.84 μs
just Jason               407.51 K        2.45 μs   ±483.13%        1.74 μs        5.87 μs
whole logger format      196.30 K        5.09 μs   ±162.16%        4.40 μs       14.59 μs

Comparison: 
logger_json encode       766.24 K
just Jason               407.51 K - 1.88x slower +1.15 μs
whole logger format      196.30 K - 3.90x slower +3.79 μs

##### With input bigger_map #####
Name                          ips        average  deviation         median         99th %
just Jason               271.29 K        3.69 μs   ±142.04%        3.45 μs        7.59 μs
logger_json encode        99.97 K       10.00 μs    ±13.77%        9.80 μs       12.89 μs
whole logger format       64.56 K       15.49 μs    ±32.81%       14.57 μs       34.33 μs

Comparison: 
just Jason               271.29 K
logger_json encode        99.97 K - 2.71x slower +6.32 μs
whole logger format       64.56 K - 4.20x slower +11.80 μs

Full benchmark (with some more logger stuff) here: https://github.com/PragTob/elixir_playground/blob/main/bench/logger_json_overhead.exs


It's still fast enough (I think) - it just feels odd.

I haven't done a profile to see what actually eats the most time (my suspicion is that the String.valid? and String.printable? checks are expensive).

A solution might be a more lightweight formatter that is more "use at your own risk" (i.e. expects you to do the work previously if necessary: redact things yourself beforehand, doesn't convert tuples, check string validity) but I suppose that might be its own project then almost. I also don't know how often these situations occur :)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions