Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add OpenTelemetry instrumentation #30130

Merged
merged 1 commit into from May 10, 2024
Merged

Add OpenTelemetry instrumentation #30130

merged 1 commit into from May 10, 2024

Conversation

renchap
Copy link
Sponsor Member

@renchap renchap commented Apr 30, 2024

This is based on the work from #20338 and #29016.

We want to deploy this to our staging environment to test how it behaves, and we can only build containers from PRs if the branch is part of the main repository, so I re-created the PR.

It mixes the good work from @robbkidd @julianocosta89

I went with use_all as advised by @robbkidd, if this load only the required gems then it should be equivalent, and is less verbose. I kept the documentation on how to disable a specific instrumentation.

The gems are now in a specific group, allowing people who do not want to install the OTEL gems to use bundle --without opentelemetry.

Once we have been able to test this, I will add back the collector config & build files from #29016

@renchap renchap added the build-image Build a container image for this PR label Apr 30, 2024
@robbkidd
Copy link
Contributor

This looks like a good consolidation. I'll try it out locally!

Some things:

  1. I'll look at the ActiveRecord notification subscription and see if there are some OTel semantic convention attributes that can be added to the span created.
  2. There's a typo in my email for the Co-authored-by:; it should end in .org.

@renchap
Copy link
Sponsor Member Author

renchap commented Apr 30, 2024

Fixed, sorry. If needed, you can ping me on the Pollinators slack

I just pushed a small change to add the current version in the service.version attribute, this should enable version-tracking features.

We also probably need to not send traces for the healthcheck endpoint.

We enabled sending OTEL traces to our Datadog account through the collector, and the span names are not very useful. Will try to see how we can improve those
image

Gemfile Show resolved Hide resolved
@renchap
Copy link
Sponsor Member Author

renchap commented Apr 30, 2024

Pushed a new change which removes ActiveSupport, ActionPack and ActionView integrations, as well as the custom ActiveRecord one, because those are enabled by the Rails integration directly now.

@robbkidd
Copy link
Contributor

robbkidd commented Apr 30, 2024

Pushed a new change which removes ActiveSupport, ActionPack and ActionView integrations, as well as the custom ActiveRecord one, because those are enabled by the Rails integration directly now.

The opentelemetry-instrumentation-rails metapackage will bring in and require opentelemetry-instrumentation-active_record.

⚠️ When I first attempted adding OpenTelemetry, the direct ActiveRecord instrumentation (which uses prepending monkeypatches) interacted badly with attr_encrypted (which at the time used alias_method monkeypatches). The custom ActiveRecord subscription was to avoid these colliding monkeys.

I see attr_encrypted is still a dependency. It was upgraded to 4.0.0 since I last tested things, but 4.0.0 still uses alias_method.

I'll see if I see this bad interaction again locally!

@mjankowski
Copy link
Contributor

I see attr_encrypted is still a dependency.

There is an approved but not yet merged PR which removes it - #28325 - part of larger migration to native AR encryption.

@robbkidd
Copy link
Contributor

... part of larger migration to native AR encryption

❤️ This is a good and noble quest.

If the monkeypatches still conflict, the custom AS::Notification subscription can return as a temporary measure until native AR encryption lands.

@robbkidd
Copy link
Contributor

robbkidd commented Apr 30, 2024

📰 I'm running in docker compose locally with 41b1bff (opentelemetry-instrumentation-active_record in, custom AR subscription out) and I'm not seeing the infinite stack sadness of the colliding monkeypatches. So, that's a good sign.

@jgaskins
Copy link

jgaskins commented May 2, 2024

Running this branch on my instance (container built here) and seeing infinite recursion happening again. It's manifesting the same as before, in AccountStat#reload.

I haven't dug too deeply into it yet; I just got it built and deployed. Is it possible we still have conflicting monkeypatching strategies going on?

Screenshot of a trace on Honeycomb showing AccountStat reload calling itself recursively after an upsert

@mjankowski
Copy link
Contributor

I haven't fully read the entire previous thing or dug deep into the arel/otel interaction here ... but FYI the PR which removes attr_encrypted has now been merged, but this branch is not yet rebased after that merge, so I'd expect whatever interactions existed there to still be present on this branch.

@renchap
Copy link
Sponsor Member Author

renchap commented May 2, 2024

Rebased with the removal of attr_encrypted

@jgaskins
Copy link

jgaskins commented May 3, 2024

the PR which removes attr_encrypted has now been merged, but this branch is not yet rebased after that merge

Oooh, okay. I definitely misread the part up-thread.

I rebuilt the container with the rebase and deployed that and I haven't seen this issue since! Excellent work, folks!

This is based on the work from #20338 and #29016

Co-authored-by: Juliano Costa <[email protected]>
Co-authored-by: Robb Kidd <[email protected]>
Copy link
Contributor

@ClearlyClaire ClearlyClaire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm somewhat uncomfortable with adding a large amount of dependencies, but this seems as self-contained as it gets, and the fact it is in an optional dependency group is appreciable.

@renchap renchap added this pull request to the merge queue May 10, 2024
Merged via the queue into main with commit 68b9fe8 May 10, 2024
54 checks passed
@renchap renchap deleted the opentelemetry branch May 10, 2024 12:45
@mjankowski
Copy link
Contributor

I recall previous discussion about the OpenTelemetry integration replacing the nsa gem statsd support -- #29065 (comment) -- on top of the initial work here, what things need to be done to enable that?

@renchap
Copy link
Sponsor Member Author

renchap commented May 10, 2024

I would say that you can now get all the information from the NSA gem using OTEL (+ sidekiq-exporter maybe), + many more useful performance data, so I would be inclined to drop the NSA gem entirely?

@mjankowski
Copy link
Contributor

Yeah ... I think that was the plan based on that original thread ... was just trying to confirm if we were there yet with just this PR, and it sounds like probably yes? I'll open PR to remove nsa and we can work through there if that's true.

@renchap
Copy link
Sponsor Member Author

renchap commented May 10, 2024

Need to think a bit about either deprecating NSA in 4.3, then removing in 4.4, or removing it in 4.3 directly

emilweth pushed a commit to rivals-space/mastodon that referenced this pull request May 11, 2024
Co-authored-by: Juliano Costa <[email protected]>
Co-authored-by: Robb Kidd <[email protected]>
@ClearlyClaire
Copy link
Contributor

We already deprecated StatsD integration in 4.2.0, but we had no replacement available at the time. Furthermore, it was kinda buried in the release notes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
build-image Build a container image for this PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants