-
Notifications
You must be signed in to change notification settings - Fork 631
feat: Metrics SDK to offer dynamic memory resizing for delta aggregations #3308
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #3308 +/- ##
=====================================
Coverage 80.7% 80.7%
=====================================
Files 129 129
Lines 23471 23551 +80
=====================================
+ Hits 18956 19027 +71
- Misses 4515 4524 +9 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| /// When false (default), HashMaps pre-allocate up to cardinality_limit. | ||
| /// This is useful when cardinality_limit is set high for rare spikes | ||
| /// but typical usage has far fewer attribute sets. | ||
| dynamic_memory_allocation_delta: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a concern about the current approach. While I understand this helps users who face higher memory usage just to accommodate rare spikes in volume, I'm worried this could be used as an escape hatch to effectively disable cardinality capping altogether. Users could set cardinality_limit to usize::MAX and enable dynamic allocation to bypass the intended safeguards.
We want users to think carefully about the cardinality of their metrics rather than resorting to workarounds that avoid that important design consideration.
That said, we also want to solve the legitimate problem of over-allocating memory for rare spikes. One option to balance these concerns would be to include "unsafe" (or similar warning language) in the public config option name, e.g., with_lazy_allocation_delta_unsafe() combined with clear documentation about:
- The intended use case (high cardinality limit with typically low actual cardinality)
- Why this shouldn't be used to bypass cardinality limits
This would make it explicit that this is an advanced option that requires careful consideration and discourage casual misuse.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm worried this could be used as an escape hatch to effectively disable cardinality capping altogether. Users could set cardinality_limit to usize::MAX and enable dynamic allocation [...]
This is exactly what I was trying to achieve when I filed the original bug report / fix. Please let users disable the cardinality limit if they want to.
[...] to bypass the intended safeguards
Perhaps users don't want safeguards :)
In our case, we're careful not to add labels which can have an arbitrary number of possible values (and thus explode). Some of the metrics have a cardinality of a few million, but the rest a normal cardinality of maybe 1-1000.
Since setting the cardinality limit is only a program level configuration option, setting the cardinality limit for individual meters is not possible. Even if it was, it would be a pain to configure, and you could possibly get the configuration wrong (which would be a disaster since you'd be silently losing metrics).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since setting the cardinality limit is only a program level configuration option, setting the cardinality limit for individual meters is not possible.
There is no program level configuration - cardinality limit is per metric.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no program level configuration - cardinality limit is per metric.
Interesting. Can you show me how to use the cardinality limit without using a view?
For example
let meter = opentelemetry::global::meter("my_library");
let counter = meter.u64_counter("my_metric").with_description("My metric").build();
I don't see any with_cardinality_limit or similar on InstrumentBuilder
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. Can you show me how to use the cardinality limit without using a view?
Cardinality limit is set using view only.
example: https://github.com/open-telemetry/opentelemetry-rust/blob/main/examples/metrics-advanced/src/main.rs#L25-L30
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Views are a program level configuration, hence my point.
The only option from what I can tell is matching on the instrument name, but that obviously doesn't scale/work for most users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure what do you mean by that. CardinalityLimit can be set for each metric individually (yes, by matching on instrument name), not just a program wide limit. If you can share some pain points, I'd be happy to explore improvements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It creates a dependency between one part of your code (where you set up the view), to another part of your code (where the instrument is created). That's problematic enough.
Furthermore, as the creator of an instrument, for example in a library, you have no control over the cardinality limit. You have to rely on each of the programs relying on that library to know each and every name of every instrument, and the proper value to set the cardinality limit to. If you change the name of the instrument, you have to remember to change every program's view as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The library author has no control over the limit and this is by design. Cardinality is affected by temporality, export frequency etc., and instrumentation author should not be concerned with these.
Changing the name of the instrument is a breaking change. If libraries randomly keep changing metrics name, you have bigger problems than just view not matching the name!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cardinality is affected by temporality, export frequency etc
Not necessarily? For my use case, it's known with 100% precision beforehand what the cardinality will be, and it's way higher than the default of 2000. At instrument creation time, I should be able to specify at least a minimum cardinality limit.
Addresses #3290 (comment) , reverting the original fix.