-
-
Notifications
You must be signed in to change notification settings - Fork 512
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prevent starting Vernier in nested transactions #2528
Prevent starting Vernier in nested transactions #2528
Conversation
f4e835c
to
2b53dd7
Compare
|
||
unless @hub.profiler_running? | ||
@profiler = @configuration.profiler_class.new(@configuration) | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This thing right here is the heart of the fix, if I remove this conditional the spec I added easily reproduces the crash from the issue:
Failures:
1) Sentry transactions / profiler starts profiler just once inside nested transactions
Failure/Error: @stack_table_hash ||= profile._stack_table.to_h
NoMethodError:
undefined method `_stack_table' for nil
# ./lib/sentry/vernier/output.rb:85:in `stack_table_hash'
# ./lib/sentry/vernier/output.rb:57:in `frames'
# ./lib/sentry/vernier/output.rb:22:in `to_h'
# ./lib/sentry/vernier/profiler.rb:100:in `to_hash'
# ./lib/sentry/transaction_event.rb:65:in `populate_profile'
# ./lib/sentry/transaction_event.rb:40:in `initialize'
# ./lib/sentry/client.rb:174:in `new'
# ./lib/sentry/client.rb:174:in `event_from_transaction'
# ./lib/sentry/transaction.rb:267:in `finish'
# ./spec/sentry/transactions/profiler_spec.rb:35:in `block (3 levels) in <top (required)>'
# ./spec/sentry/transactions/profiler_spec.rb:18:in `block (2 levels) in <top (required)>'
# /usr/local/rvm/gems/default/gems/rspec-retry-0.6.2/lib/rspec/retry.rb:124:in `block in run'
# /usr/local/rvm/gems/default/gems/rspec-retry-0.6.2/lib/rspec/retry.rb:110:in `run'
# /usr/local/rvm/gems/default/gems/rspec-retry-0.6.2/lib/rspec_ext/rspec_ext.rb:12:in `run_with_retry'
# /usr/local/rvm/gems/default/gems/rspec-retry-0.6.2/lib/rspec/retry.rb:37:in `block (2 levels) in setup'
Finished in 0.02769 seconds (files took 0.20007 seconds to load)
1 example, 1 failure
Failed examples:
rspec ./spec/sentry/transactions/profiler_spec.rb:17 # Sentry transactions / profiler starts profiler just once inside nested transactions
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #2528 +/- ##
==========================================
- Coverage 64.45% 64.11% -0.34%
==========================================
Files 123 123
Lines 4707 4724 +17
==========================================
- Hits 3034 3029 -5
- Misses 1673 1695 +22
|
@@ -257,7 +261,7 @@ def finish(hub: nil, end_timestamp: nil) | |||
@name = UNLABELD_NAME | |||
end | |||
|
|||
@profiler.stop | |||
@hub.stop_profiler!(self) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that the hub is responsible for starting a profiler, making this change actually felt like an improvement. Transactions are coupled to hub anyway.
I think more of profiler related logic can actually be moved out of |
Something like this? diff --git a/sentry-ruby/lib/sentry/hub.rb b/sentry-ruby/lib/sentry/hub.rb
index 788e8902..a1343831 100644
--- a/sentry-ruby/lib/sentry/hub.rb
+++ b/sentry-ruby/lib/sentry/hub.rb
@@ -23,8 +23,17 @@ module Sentry
def start_profiler!(transaction)
MUTEX.synchronize do
- transaction.start_profiler!
- @current_profiler[transaction.__id__] = transaction.profiler
+ return if !@current_profiler.empty?
+ profiler = configuration.profiler_class.new(configuration)
+ profiler.set_initial_sample_decision(transaction.sampled)
+ profiler.start
+ @current_profiler[transaction.__id__] = profiler
+ end
+ end
+
+ def profiler_for(transaction)
+ MUTEX.synchronize do
+ @current_profiler[transaction.__id__]
end
end
@@ -34,12 +43,6 @@ module Sentry
end
end
- def profiler_running?
- MUTEX.synchronize do
- !@current_profiler.empty?
- end
- end
-
def new_from_top
Hub.new(current_client, current_scope)
end
diff --git a/sentry-ruby/lib/sentry/transaction.rb b/sentry-ruby/lib/sentry/transaction.rb
index 6fc2524b..8fc077e5 100644
--- a/sentry-ruby/lib/sentry/transaction.rb
+++ b/sentry-ruby/lib/sentry/transaction.rb
@@ -56,10 +56,6 @@ module Sentry
# @return [Hash]
attr_reader :contexts
- # The Profiler instance for this transaction.
- # @return [Profiler]
- attr_reader :profiler
-
def initialize(
hub:,
name: nil,
@@ -86,10 +82,6 @@ module Sentry
@contexts = {}
@measurements = {}
- unless @hub.profiler_running?
- @profiler = @configuration.profiler_class.new(@configuration)
- end
-
init_span_recorder
end
@@ -300,13 +292,16 @@ module Sentry
@contexts[key] = value
end
+ # The Profiler instance for this transaction.
+ # @return [Profiler]
+ def profiler
+ @hub.profiler_for(self)
+ end
+
# Start the profiler.
# @return [void]
def start_profiler!
- return unless profiler
-
- profiler.set_initial_sample_decision(sampled)
- profiler.start
+ @hub.start_profiler!(self)
end
# These are high cardinality and thus bad
diff --git a/sentry-ruby/spec/sentry/transactions/profiler_spec.rb b/sentry-ruby/spec/sentry/transactions/profiler_spec.rb
index 78bb385b..fb1efd7b 100644
--- a/sentry-ruby/spec/sentry/transactions/profiler_spec.rb
+++ b/sentry-ruby/spec/sentry/transactions/profiler_spec.rb
@@ -21,8 +21,6 @@ RSpec.describe Sentry, 'transactions / profiler', when: [:vernier_installed?, :r
ProfilerTest::Bar.bar
- expect(Sentry.get_current_hub.profiler_running?).to be(true)
-
expect(parent_transaction.profiler).to_not be_nil
expect(nested_transaction.profiler).to be_nil |
not for now since the profile is currently linked to the transaction. |
df9f4b3
to
4d00641
Compare
I'd prefer to not make any more changes than the minimum to get the issue fixed, especially that hub doesn't expose a method for finishing a transaction, and all code that finishes transactions uses Transaction#finish, so it's the best place to stop profiling. |
4d00641
to
b359bfd
Compare
I agree with both but I don't think my patch would change them? We can also just pass down the profiler to the transaction through a keyword argument if we want to actually retain a reference to it there. Also, under the current implementation there's a gap between If we only create the profiler in |
ok let's take a step back, is this
|
@sl0thentr0py so, I've been grinding my teeth when trying to make it work by checking Vernier. It stores an instance variable called diff --git sentry-ruby/lib/sentry/hub.rb sentry-ruby/lib/sentry/hub.rb
index 788e8902..8f9ed9e0 100644
--- sentry-ruby/lib/sentry/hub.rb
+++ sentry-ruby/lib/sentry/hub.rb
@@ -18,26 +18,6 @@ module Sentry
first_layer = Layer.new(client, scope)
@stack = [first_layer]
@last_event_id = nil
- @current_profiler = {}
- end
-
- def start_profiler!(transaction)
- MUTEX.synchronize do
- transaction.start_profiler!
- @current_profiler[transaction.__id__] = transaction.profiler
- end
- end
-
- def stop_profiler!(transaction)
- MUTEX.synchronize do
- @current_profiler.delete(transaction.__id__)&.stop
- end
- end
-
- def profiler_running?
- MUTEX.synchronize do
- !@current_profiler.empty?
- end
end
def new_from_top
@@ -120,7 +100,7 @@ module Sentry
sampling_context.merge!(custom_sampling_context)
transaction.set_initial_sample_decision(sampling_context: sampling_context)
- start_profiler!(transaction)
+ transaction.start_profiler!
transaction
end
diff --git sentry-ruby/lib/sentry/transaction.rb sentry-ruby/lib/sentry/transaction.rb
index 6fc2524b..591c116c 100644
--- sentry-ruby/lib/sentry/transaction.rb
+++ sentry-ruby/lib/sentry/transaction.rb
@@ -85,10 +85,7 @@ module Sentry
@effective_sample_rate = nil
@contexts = {}
@measurements = {}
-
- unless @hub.profiler_running?
- @profiler = @configuration.profiler_class.new(@configuration)
- end
+ @profiler = @configuration.profiler_class.new(@configuration)
init_span_recorder
end
@@ -261,7 +258,7 @@ module Sentry
@name = UNLABELD_NAME
end
- @hub.stop_profiler!(self)
+ @profiler.stop
if @sampled
event = hub.current_client.event_from_transaction(self)
diff --git sentry-ruby/lib/sentry/vernier/profiler.rb sentry-ruby/lib/sentry/vernier/profiler.rb
index 2818ae8d..4a2c77ec 100644
--- sentry-ruby/lib/sentry/vernier/profiler.rb
+++ sentry-ruby/lib/sentry/vernier/profiler.rb
@@ -8,13 +8,14 @@ module Sentry
module Vernier
class Profiler
EMPTY_RESULT = {}.freeze
+ MUTEX = Mutex.new
- attr_reader :started, :event_id, :result
+ attr_reader :event_id, :result
def initialize(configuration)
@event_id = SecureRandom.uuid.delete("-")
- @started = false
+ @vernier_collector = nil
@sampled = nil
@profiling_enabled = defined?(Vernier) && configuration.profiling_enabled?
@@ -51,38 +52,53 @@ module Sentry
log("Discarding profile due to sampling decision") unless @sampled
end
- def start
- return unless @sampled
- return if @started
+ def other_vernier_running?
+ MUTEX.synchronize do
+ @vernier_collector && vernier_collector_id != @vernier_collector.__id__
+ end
+ end
- @started = ::Vernier.start_profile
+ def this_vernier_running?
+ MUTEX.synchronize do
+ @vernier_collector && vernier_collector_id == @vernier_collector.__id__
+ end
+ end
- log("Started")
+ def vernier_collector_id
+ Vernier.instance_variable_get(:@collector).__id__ if Vernier.instance_variable_get(:@collector)
+ end
- @started
- rescue RuntimeError => e
- # TODO: once Vernier raises something more dedicated, we should catch that instead
- if e.message.include?("Profile already started")
- log("Not started since running elsewhere")
- else
- log("Failed to start: #{e.message}")
- end
+ def start
+ return if !@sampled || other_vernier_running?
+
+ @vernier_collector = ::Vernier.start_profile
+
+ log("Started with Vernier collector id: #{vernier_collector_id}")
+ # rescue RuntimeError => e
+ # # TODO: once Vernier raises something more dedicated, we should catch that instead
+ # if e.message.include?("Profile already started")
+ # log("Not started since running elsewhere")
+ # else
+ # log("Failed to start: #{e.message}")
+ # end
+ rescue
+ debugger
end
def stop
return unless @sampled
- return unless @started
- @result = ::Vernier.stop_profile
- @started = false
+ @result = ::Vernier.stop_profile if this_vernier_running?
log("Stopped")
- rescue RuntimeError => e
- if e.message.include?("Profile not started")
- log("Not stopped since not started")
- else
- log("Failed to stop Vernier: #{e.message}")
- end
+ # rescue RuntimeError => e
+ # if e.message.include?("Profile not started")
+ # log("Not stopped since not started")
+ # else
+ # log("Failed to stop Vernier: #{e.message}")
+ # end
+ ensure
+ @vernier_collector = nil
end
def active_thread_id
diff --git sentry-ruby/spec/sentry/transactions/profiler_spec.rb sentry-ruby/spec/sentry/transactions/profiler_spec.rb
index 78bb385b..5311fac6 100644
--- sentry-ruby/spec/sentry/transactions/profiler_spec.rb
+++ sentry-ruby/spec/sentry/transactions/profiler_spec.rb
@@ -15,19 +15,16 @@ RSpec.describe Sentry, 'transactions / profiler', when: [:vernier_installed?, :r
end
it 'starts profiler just once inside nested transactions' do
- 10.times do
+ 10000.times do
parent_transaction = Sentry.start_transaction(name: "parent")
nested_transaction = Sentry.start_transaction(name: "nested")
- ProfilerTest::Bar.bar
-
- expect(Sentry.get_current_hub.profiler_running?).to be(true)
-
- expect(parent_transaction.profiler).to_not be_nil
- expect(nested_transaction.profiler).to be_nil
+ ProfilerTest::Bar::Foo.foo(10)
nested_transaction.finish
parent_transaction.finish
+
+ expect(parent_transaction.profiler.to_hash).to_not be_empty
end
end
end
diff --git sentry-ruby/spec/support/profiler.rb sentry-ruby/spec/support/profiler.rb
index ee83cc4d..2a8803a1 100644
--- sentry-ruby/spec/support/profiler.rb
+++ sentry-ruby/spec/support/profiler.rb
@@ -3,8 +3,8 @@
module ProfilerTest
module Bar
module Foo
- def self.foo
- 1e6.to_i.times { 2**2 }
+ def self.foo(count = 1e6.to_i)
+ count.times { 2**2 }
end
end ...and it just doesn't work. Once we have more than one profile running and they try to interact with Vernier, things become unstable and we end up with errors. It passes really consistently when it's handled higher-up in the Hub. I also think it's good that we don't create a profiler when it's not going to be used. It seems to be the simplest thing that can possibly work. Can we just stick to this solution? |
could we ask the Vernier folks what the best way to do this is? Maybe they can expose it officially? |
hey @jhawthorn any chance you could chime in and help us out with this? Long story short is that we need to prevent our transactions from starting Vernier more than once, and it turned out to be a challenge. I put some guards in this PR that ensure that we do not create more than one Profiler instance, but we'd prefer if we could handle it in the profiler itself. I noticed there's |
5b6c609
to
c0d9002
Compare
@sl0thentr0py @st0012 I reverted my attempts at solving it at the transaction level and I still think having hub orchestrate this is the simplest thing to do and it works, unlike every other attempt at checking if Vernier is running. Could you revisit this please? Without fixing this Vernier profiler is not stable so I'd love to get the fix done. |
c0d9002
to
b94fb82
Compare
and let's please just merge this minimal change, a bigger refactor is not necessary for this very edge case scenario. Hub/profiler etc will change in the future anyway so I don't want us to spend time on refactors. |
Fixes #2525
I suspect this will also help with #2457