reporter: Improve stack hashing for better cachability #2994

Open, 1 commit to be merged into base: main

brancz (Member) commented Oct 16, 2024

Previously we used the internal trace hashes from the opentelemetry-ebpf-profiler. However, that hash is the same whether or not we end up getting full symbols for, e.g., interpreted frames, which meant that we couldn't safely cache incomplete stacks.

This change introduces a separate hashing mechanism that hashes the fully symbolized stacks, including whether each frame was symbolized or not, so this hash can safely be used to cache those stacks.
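
To illustrate the idea for reviewers, here is a minimal sketch of hashing a symbolized stack so that the symbolization state of each frame feeds into the result. It is not the code in this PR: the `Frame` type, its fields, the `HashStack` name, and the choice of SHA-256 are all assumptions made for the example.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"encoding/hex"
)

// Frame is a hypothetical, simplified view of one stack frame after
// symbolization. The real agent's frame type will differ.
type Frame struct {
	Address    uint64
	Function   string
	File       string
	Line       uint32
	Symbolized bool // false if no symbols were resolved for this frame
}

// HashStack returns a hex digest over the full symbolized stack.
// SHA-256 is an assumption here; any stable hash would do.
func HashStack(frames []Frame) string {
	h := sha256.New()
	var buf [8]byte

	writeUint64 := func(v uint64) {
		binary.LittleEndian.PutUint64(buf[:], v)
		h.Write(buf[:])
	}
	// Strings are length-prefixed so that adjacent fields cannot be
	// confused with each other ("ab"+"c" vs "a"+"bc").
	writeString := func(s string) {
		writeUint64(uint64(len(s)))
		h.Write([]byte(s))
	}

	writeUint64(uint64(len(frames)))
	for _, f := range frames {
		writeUint64(f.Address)
		if f.Symbolized {
			// Marker byte 1: symbol data follows.
			h.Write([]byte{1})
			writeString(f.Function)
			writeString(f.File)
			writeUint64(uint64(f.Line))
		} else {
			// Marker byte 0: frame is unsymbolized, so an incomplete
			// stack hashes differently from its complete counterpart.
			h.Write([]byte{0})
		}
	}
	return hex.EncodeToString(h.Sum(nil))
}
```

With a scheme like this, a stack whose interpreted frame could not be symbolized gets a different hash than the same stack once symbols are available, so a cache keyed on the hash never serves the incomplete version in place of the complete one.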

A relatively unscientific experiment with a test cluster shows roughly a 40-50% reduction in bytes sent over the wire for servers that perform caching.

Judging by profiling data (or lack thereof), this change does not appear to have any effect on the baseline CPU usage of the agent.

[Image: Screenshot 2024-10-16 at 14 00 21]