Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unify Histogram and ExpHistogram aggregation #2114

Closed
wants to merge 1 commit into from

Conversation

fraillt
Copy link
Contributor

@fraillt fraillt commented Sep 13, 2024

Fixes #
Introducing common abstraction solved several issues with ExpoHistogram:

  • added is_under_cardinality_limit
  • improved performance GREATLY +100%.

Also fixed broken benchmark for histogram collection.

Changes

Provided abstractions to unify Histogram and ExpoHistogram behaviour.
There's two new types:

  • trait Aggregator - provides aggregation specific implementation
  • struct AttributeSetAggregation - has two tasks:
    • select and update specific aggregator very efficiently. (very similar to current ValueMap implementation.)
    • provide functions to collect (and reset) all measurements for DataPoints.

Generally performance for Histogram is not changes much, only collection phase was improved 10-30% percent (the more data you have the bigger the improvement)

Merge requirement checklist

  • CONTRIBUTING guidelines followed
  • Unit tests added/updated (if applicable)
  • Appropriate CHANGELOG.md files updated for non-trivial, user-facing changes
  • Changes in public API reviewed (if applicable)

@fraillt fraillt requested a review from a team September 13, 2024 13:02
Copy link

codecov bot commented Sep 13, 2024

Codecov Report

Attention: Patch coverage is 91.22257% with 28 lines in your changes missing coverage. Please review.

Project coverage is 78.3%. Comparing base (7ab5e0f) to head (96be7f2).

Files with missing lines Patch % Lines
.../src/metrics/internal/attribute_set_aggregation.rs 79.6% 21 Missing ⚠️
...pentelemetry-sdk/src/metrics/internal/histogram.rs 94.5% 4 Missing ⚠️
...-sdk/src/metrics/internal/exponential_histogram.rs 97.6% 3 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##            main   #2114     +/-   ##
=======================================
- Coverage   78.3%   78.3%   -0.1%     
=======================================
  Files        121     122      +1     
  Lines      20815   20803     -12     
=======================================
- Hits       16308   16294     -14     
- Misses      4507    4509      +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@utpilla
Copy link
Contributor

utpilla commented Sep 13, 2024

@fraillt I don't completely follow the need for unifying abstractions for the two histograms?

The regular histogram is already using the efficient lookup offered by ValueMap. If all we want is ExponentialHistogram to have an efficient lookup and respect cardinality limit, why can't we just update ExponentialHistogram to use ValueMap (similar to the regular Histogram)?

@fraillt
Copy link
Contributor Author

fraillt commented Sep 13, 2024

@fraillt I don't completely follow the need for unifying abstractions for the two histograms?

The regular histogram is already using the efficient lookup offered by ValueMap. If all we want is ExponentialHistogram to have an efficient lookup and respect cardinality limit, why can't we just update ExponentialHistogram to use ValueMap (similar to the regular Histogram)?

Good question.
I tried to look at how to improve ExponentialHistogram (it was lacking performances and features), but it wasn't possible to easily do that with existing ValueMap implementation.

  • I would have to add yet another method in AtomicTracker it already has update_histogram(bucket_size) (removed in this PR), so I would had to add yet another method update_exp_histogram(....).
  • also ValueMap has measure(measurement, attrs, index) method, the last argument is only for histograms, but for exp histogram I think i needed few more arguments...
  • I would be forced to store more data in HistogramDataPoint, because interface doesn't allow to pass in configuration
    I think that's the reason why ExpHistogram didnt use ValueMap, because it doesn't fit well...

So I had a choice:

  • either modify existing ValueMap/AtomicTracker abstractions (and all other aggregations with it) to somehow make it work with ExpHistogram, with the cost that it will not be efficient anyway.
  • or avoid refactoring existing code and come up with efficient abstraction for histograms.

I choose the other option :) What is nice about it, is that it should work perfectly fine with counter types as well!!!
Eventually, I would like to have one efficient/consistent abstraction for all metrics and I think it is achievable.
As a comparison you can compare Histogram implementation.

  • main branch - 352 lines of code
  • my branch - 201 lines of code
    Not saying that lines of code is good or bad, but in my opinion this new interface abstracted a lot of details and now you can focus on core functionality instead.

@cijothomas
Copy link
Member

@fraillt Thanks for continuing your great work! I'd suggest to keep the scope of each PR very small, so it is easy to review and share feedbacks. (Metrics hot path code reviews need lot of focused attention, so smaller PRs are really appreciated here.)

Also, we'd like to keep ExponentialHistogram and ExplicitHistogram separate, and let ExponentialHistogram pickup ValueMap for its own improved speed. ExponentialHistogram is a relatively new entrant, and we need to do extensive testing before marking it stable. I highly doubt if we can do that by 1.0, so separating it is preferred.

if you can join our community meetings (Tuesdays 9 AM Pacific), that'd be a good time to discuss some of the design ideas/other improvements! Most of the meeting time is spent discussing non-trivial problems, and you can definitely help there too, if you have the bandwidth to join. (If the time is not friendly to your time zone, we can do ad-hoc meetings to allow more participations)

@fraillt
Copy link
Contributor Author

fraillt commented Sep 14, 2024

@cijothomas Would be interesting to join community meeting :) Timing is alright as well (19 PM EEST my time).

@fraillt fraillt mentioned this pull request Sep 14, 2024
4 tasks
@fraillt
Copy link
Contributor Author

fraillt commented Sep 14, 2024

I'll close this, because as you pointed out, this might be too big to review.
I have created #2117 that uses same new Aggregator interface, but should be more easy to review, as I tried to make it as small as possible.
This revision might serve as an example what I want to do next, in smaller steps. Providing unified collect function is really important, as I have encountered several subtle bugs or redundant clones that probably happened due to copy/paste, so having unified collect function(s) could solve these inconsistencies, and provide performance improvements across all metrics.

@fraillt fraillt closed this Sep 14, 2024
@cijothomas
Copy link
Member

@cijothomas Would be interesting to join community meeting :) Timing is alright as well (19 PM EEST my time).

Thanks! See ya at the meeting!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants