-
Notifications
You must be signed in to change notification settings - Fork 32
Panic hook can result in unexpected access to OCaml runtime from another thread #138
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Interesting, there is certainly a lot that can go wrong with the combination of async, panics and OCaml exceptions! I don't know anything about how tokio manages panics and am not even sure where to start looking into this issue. Does disabling the ocaml-rs panic hook get you a more helpful backtrace? |
I'll try to come up with some repro later, chasing some other segfault at the moment 😵 |
A-a-and I stumbled upon this again...
Panic handlers are installed globally and affect all Rust threads, even those not currently (or even at all) running OCaml code, which leads to unsynchronized access to OCaml runtime, resulting in issues like in above valgrind report. Ideally I think there should be a thread-local stack of guards, which must exist only when we're inside OCaml code, or in a context where raising OCaml exception would be appropriate behavior. Panic handler should apply its current behavior iff there is an active guard in the stack on current thread, and do nothing otherwise. |
|
Quick idea: maybe runtime handle should include that panic hook guard, as where runtime handle is available, it is assumed that we're executing within OCaml context (called from OCaml via stubs or running on active OCaml domain thread). |
Something along these lines maybe use std::cell::Cell;
use std::panic;
use std::sync::Once;
thread_local! {
static GUARD_COUNT: Cell<usize> = Cell::new(0);
}
static INIT: Once = Once::new();
pub struct OcamlPanicGuard;
impl OcamlPanicGuard {
pub fn new() -> Self {
INIT.call_once(|| {
let original_hook = panic::take_hook();
panic::set_hook(Box::new(move |panic_info| {
if GUARD_COUNT.with(|count| count.get()) > 0 {
let err = panic_info.payload();
let msg = if let Some(s) = err.downcast_ref::<&str>() {
s.to_string()
} else if let Some(s) = err.downcast_ref::<String>() {
s.clone()
} else {
format!("{:?}", err)
};
ocaml::Error::raise_failure(&msg);
} else {
original_hook(panic_info);
}
}));
});
GUARD_COUNT.with(|count| count.set(count.get() + 1));
OcamlPanicGuard
}
}
impl Drop for OcamlPanicGuard {
fn drop(&mut self) {
GUARD_COUNT.with(|count| count.set(count.get() - 1));
}
} |
thanks for investigating this! i don't have a lot of time to work on a fix right now, but would be happy to review a PR if you come up with something. |
Thanks for looking into this! Will definitely try this out, but can't promise to do it soon. We've disabled panic hook so far and mitigated the problem, but having panic hook is actually nice and useful inside the binding functions. |
I'm spawning some tokio task from my
#[ocaml::func]
Rust function, that is called from OCaml. The task itself does not call any OCaml code inside, but when a panic happens inside that task, I see surprising errors in valgrind before the whole thing segfaults:Tokio runtime worker thread trying to call
caml_fatal_uncaught_exception
, what could possibly go wrong 😂I'm not yet sure on why this is happening, I see that ocaml-rs is setting up that panic hook in initial setup [1], probably rust takes the hook from the thread, that created tokio task?
[1] https://github.com/zshipko/ocaml-rs/blob/master/src/macros.rs#L62
The text was updated successfully, but these errors were encountered: