Timeout::Error - unknown cause or caller stacktrace #764

Open
mattheworiordan opened this issue Aug 29, 2017 · 9 comments

Comments

@mattheworiordan

I have recently started noticing Actors failing in our Celluloid service with the following error and stacktrace:

E, [2017-08-29T07:35:30.647265 #3370] ERROR -- : Actor crashed!
Timeout::Error: execution expired
  /app/vendor/bundle/ruby/2.4.0/gems/celluloid-0.17.3/lib/celluloid/mailbox.rb:63:in `sleep'
  /app/vendor/bundle/ruby/2.4.0/gems/celluloid-0.17.3/lib/celluloid/mailbox.rb:63:in `wait'
  /app/vendor/bundle/ruby/2.4.0/gems/celluloid-0.17.3/lib/celluloid/mailbox.rb:63:in `block in check'
  /app/vendor/bundle/ruby/2.4.0/gems/timers-4.1.2/lib/timers/wait.rb:15:in `block in for'
  /app/vendor/bundle/ruby/2.4.0/gems/timers-4.1.2/lib/timers/wait.rb:14:in `loop'
  /app/vendor/bundle/ruby/2.4.0/gems/timers-4.1.2/lib/timers/wait.rb:14:in `for'
  /app/vendor/bundle/ruby/2.4.0/gems/celluloid-0.17.3/lib/celluloid/mailbox.rb:58:in `check'
  /app/vendor/bundle/ruby/2.4.0/gems/celluloid-0.17.3/lib/celluloid/actor.rb:155:in `block in run'
  /app/vendor/bundle/ruby/2.4.0/gems/timers-4.1.2/lib/timers/group.rb:68:in `wait'
  /app/vendor/bundle/ruby/2.4.0/gems/celluloid-0.17.3/lib/celluloid/actor.rb:152:in `run'
  /app/vendor/bundle/ruby/2.4.0/gems/celluloid-0.17.3/lib/celluloid/actor.rb:131:in `block in start'
  /app/vendor/bundle/ruby/2.4.0/gems/celluloid-essentials-0.20.5/lib/celluloid/internals/thread_handle.rb:14:in `block in initialize'
  /app/vendor/bundle/ruby/2.4.0/gems/celluloid-0.17.3/lib/celluloid/actor/system.rb:78:in `block in get_thread'
  /app/vendor/bundle/ruby/2.4.0/gems/celluloid-0.17.3/lib/celluloid/group/spawner.rb:50:in `block in instantiate'

I am a bit confused as to what is causing this error as there is no stack trace that correlates with any calls from my app. Any advice on how to work out what could be causing this error would be hugely appreciated so that I can look into the root cause. I suspect it could be:

  • A blocked Actor
  • A call to sleep

@tarcieri
Member

This is likely caused by something in your code raising Timeout::Error. Celluloid itself does not use timeout.rb, as it is not thread safe.

However, Celluloid does propagate stack traces between "tasks", and it looks like the original stack trace may be getting lost somewhere in that propagation.

Narrowing down what in your code is using timeout.rb would be a good first step towards debugging this.
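One possible way to narrow this down, sketched here under assumptions, is to record every Timeout::Error at the moment it is raised, before any stack trace can be lost in propagation. The helper name `trace_timeouts` and the shape of the recorded hash are hypothetical and not part of Celluloid or timeout.rb. Whether the hook also fires for timeouts injected into another thread may depend on the Ruby version, so treat this as a starting point only.

```ruby
require "timeout"

# Hypothetical helper (not part of Celluloid): records every Timeout::Error
# raised while the block runs, along with the stack at the raise site.
def trace_timeouts(sink = [])
  tracer = TracePoint.new(:raise) do |tp|
    error = tp.raised_exception
    next unless error.is_a?(Timeout::Error)

    sink << {
      error: error.class.name,
      path: tp.path,     # file in which the exception was raised
      line: tp.lineno,   # line at which it was raised
      stack: caller      # call stack as seen from inside the hook
    }
  end
  tracer.enable
  yield
  sink
ensure
  tracer.disable if tracer
end

# Usage: wrap the suspect code and inspect what was recorded.
records = trace_timeouts do
  begin
    raise Timeout::Error, "execution expired"
  rescue Timeout::Error
    # swallowed here only so the example keeps running
  end
end
records.each { |r| puts "#{r[:error]} raised at #{r[:path]}:#{r[:line]}" }
```

Because the hook runs for every raised exception, it adds overhead and is probably best enabled only while debugging.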

@mattheworiordan
Author

> This is likely caused by something in your code raising Timeout::Error

Ok, unfortunately that means it's another dependency raising that exception, but I really appreciate you confirming it's not used by Celluloid. That's a good start.

> However, Celluloid does propagate stack traces between "tasks", and it looks like the original stack trace may be getting lost somewhere in that propagation.

Any ideas how that may happen?

Thanks for the super speedy response!

@tarcieri
Member

> Any ideas how that may happen?

It would be a bug in Celluloid, if true.

@mattheworiordan
Author

FYI, I did track it down to an HTTP request that resulted in a Timeout::Error being raised. Is there anything I can do to help diagnose why the stacktrace was not passed on correctly?
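As a stopgap while the propagation problem is open, the call site can log the timeout itself before the error leaves the actor method. The sketch below assumes a hypothetical helper named `log_timeouts`; it is not a Celluloid API. It rescues Timeout::Error, which also covers Net::HTTP's Net::OpenTimeout and Net::ReadTimeout since both are subclasses, writes the original backtrace to a logger, and re-raises so supervision still sees the failure.

```ruby
require "logger"
require "timeout"

# Hypothetical helper (not a Celluloid API): logs where a timeout originated,
# then re-raises the same exception so the caller still fails.
def log_timeouts(label, logger)
  yield
rescue Timeout::Error => e
  trace = Array(e.backtrace).join("\n  ")
  logger.error("#{label}: #{e.class}: #{e.message}\n  #{trace}")
  raise
end

# Usage: wrap the HTTP request (simulated here with a plain raise).
logger = Logger.new($stderr)
begin
  log_timeouts("fetch status", logger) do
    raise Timeout::Error, "execution expired"
  end
rescue Timeout::Error
  # the error still propagates after being logged
end
```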

@tarcieri
Member

tarcieri commented Sep 7, 2017

I'm guessing it's because Timeout::Error is rooted in Exception and not StandardError, and (for good reasons) Celluloid does not rescue Exception.
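This guess can be checked on whatever Ruby version the service runs by printing the superclass chain of Timeout::Error. The helper name `exception_chain` is hypothetical. Note that the timeout.rb shipped with recent Rubies defines Timeout::Error as a subclass of RuntimeError, so the result may not match the guess above.

```ruby
require "timeout"

# Hypothetical helper: walks the superclass chain of a class, from the
# class itself up to the root of the hierarchy.
def exception_chain(klass)
  chain = []
  while klass
    chain << klass
    klass = klass.superclass
  end
  chain
end

chain = exception_chain(Timeout::Error)
puts chain.join(" < ")

# A bare `rescue` (or `rescue => e`) only catches StandardError descendants.
puts "caught by a bare rescue: #{Timeout::Error <= StandardError ? 'yes' : 'no'}"
```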

@tarcieri
Member

Closing as stale. If you are still interested in this issue, please reopen it.

@mattheworiordan
Author

Sure, you can close, but sadly this is not resolved. Celluloid overall is a fantastic gem, but it is plagued by edge cases that keep terminating our long-running services, which of course goes against the idea that we fail early and recover: instead, we fail and die. I appreciate I've not contributed a PR to fix the issue, so thanks for following up on this.

@tarcieri
Member

@mattheworiordan thanks for your candid feedback! I agree Celluloid was what some might describe as "too magical", and bugs/failure modes like this are somewhat awful to debug, which is perhaps why I soured on the entire thing.

I'm trying to visibly deprecate Celluloid from the Ruby ecosystem, but there are volunteers interested in supporting its existing userbase, so I'm trying to do some initial triage on open issues for them. Apologies on hastily closing this issue.

I'll go ahead and reopen it.

@tarcieri reopened this Jan 15, 2019

@mattheworiordan
Author

Thanks @tarcieri. I feel disappointed I cannot make the time to resolve the issue myself. Celluloid is conceptually beautiful. Sadly, without wider community support, I appreciate deprecation is the responsible route forward. Sad indeed when it has done a great job and pushed forward so many great concepts. Feel free to close this issue though. I’m going to start looking at alternatives, most likely moving away from the Ruby lang :(
