Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Light Web Node panic after restart #3101

Open
Albermonte opened this issue Nov 19, 2024 · 5 comments
Open

Light Web Node panic after restart #3101

Albermonte opened this issue Nov 19, 2024 · 5 comments

Comments

@Albermonte
Copy link
Member

Happens after restarting the node (killing the app and starting it again)

Env:

  • "@nimiq/core": "2.0.0"
  • nodejs v20.10.0
2024-11-19T21:22:04.126000128Z WARN  lib                  | Unable to set event listener for 'message' event: JsValue(TypeError: getObject(...).addEventListener is not a function
TypeError: getObject(...).addEventListener is not a function
    at /home/nimiq/Nimiq-Sunset-Cyberspace-Backend/node_modules/@nimiq/core/nodejs/worker-wasm/index.js:2283:21
    at handleError (/home/nimiq/Nimiq-Sunset-Cyberspace-Backend/node_modules/@nimiq/core/nodejs/worker-wasm/index.js:253:18)
    at module.exports.__wbg_addEventListener_717681e3b948fb96 (/home/nimiq/Nimiq-Sunset-Cyberspace-Backend/node_modules/@nimiq/core/nodejs/worker-wasm/index.js:2282:78)
    at wasm://wasm/0193b6e2:wasm-function[131]:0x52375
    at wasm://wasm/0193b6e2:wasm-function[2866]:0x324586
    at wasm://wasm/0193b6e2:wasm-function[2252]:0x3095f0
    at wasm://wasm/0193b6e2:wasm-function[4783]:0x35075c
    at __wbg_adapter_58 (/home/nimiq/Nimiq-Sunset-Cyberspace-Backend/node_modules/@nimiq/core/nodejs/worker-wasm/index.js:248:10)
    at real (/home/nimiq/Nimiq-Sunset-Cyberspace-Backend/node_modules/@nimiq/core/nodejs/worker-wasm/index.js:229:20)
    at node:internal/process/task_queues:140:7)

[2024-11-19T21:22:04.168Z] Have client worker remote
[2024-11-19T21:22:04.169Z] Setup Head Change
2024-11-19T21:21:58.583000064Z INFO  lib                  | Web config web_config=ClientConfiguration { network_id: MainAlbatross, seed_nodes: ["/dns4/aurora.seed.nimiq.com/tcp/443/wss", "/dns4/catalyst.seed.nimiq.network/tcp/443/wss", "/dns4/cipher.seed.nimiq-network.com/tcp/443/wss", "/dns4/eclipse.seed.nimiq.cloud/tcp/443/wss", "/dns4/lumina.seed.nimiq.systems/tcp/443/wss", "/dns4/nebula.seed.nimiq.com/tcp/443/wss", "/dns4/nexus.seed.nimiq.network/tcp/443/wss", "/dns4/polaris.seed.nimiq-network.com/tcp/443/wss", "/dns4/photon.seed.nimiq.cloud/tcp/443/wss", "/dns4/pulsar.seed.nimiq.systems/tcp/443/wss", "/dns4/quasar.seed.nimiq.com/tcp/443/wss", "/dns4/solstice.seed.nimiq.network/tcp/443/wss", "/dns4/vortex.seed.nimiq.cloud/tcp/443/wss", "/dns4/zenith.seed.nimiq.systems/tcp/443/wss"], log_level: "info" }

2024-11-19T21:21:58.584999936Z INFO  lib                  | Final configuration config=ClientConfig { network: NetworkConfig { listen_addresses: [], advertised_addresses: None, user_agent: UserAgent("core-rs-albatross/1.0.0 (native;  wasm32)"), seeds: [Seed { address: /dns4/aurora.seed.nimiq.com/tcp/443/wss }, Seed { address: /dns4/catalyst.seed.nimiq.network/tcp/443/wss }, Seed { address: /dns4/cipher.seed.nimiq-network.com/tcp/443/wss }, Seed { address: /dns4/eclipse.seed.nimiq.cloud/tcp/443/wss }, Seed { address: /dns4/lumina.seed.nimiq.systems/tcp/443/wss }, Seed { address: /dns4/nebula.seed.nimiq.com/tcp/443/wss }, Seed { address: /dns4/nexus.seed.nimiq.network/tcp/443/wss }, Seed { address: /dns4/polaris.seed.nimiq-network.com/tcp/443/wss }, Seed { address: /dns4/photon.seed.nimiq.cloud/tcp/443/wss }, Seed { address: /dns4/pulsar.seed.nimiq.systems/tcp/443/wss }, Seed { address: /dns4/quasar.seed.nimiq.com/tcp/443/wss }, Seed { address: /dns4/solstice.seed.nimiq.network/tcp/443/wss }, Seed { address: /dns4/vortex.seed.nimiq.cloud/tcp/443/wss }, Seed { address: /dns4/zenith.seed.nimiq.systems/tcp/443/wss }], tls: None, desired_peer_count: 6, peer_count_max: 50, peer_count_per_ip_max: 10, peer_count_per_subnet_max: 10, only_secure_ws_connections: true, allow_loopback_addresses: false, dht_quorum: None }, consensus: ConsensusConfig { sync_mode: Light, min_peers: 3, max_epochs_stored: 1, full_sync_threshold: 10800, index_history: true }, network_id: MainAlbatross, storage: Volatile, zk_prover: None }

2024-11-19T21:21:58.584999936Z INFO  lib                  | Initializing light client

2024-11-19T21:21:58.590000128Z INFO  client               | Identity public key: PublicKey { publickey: Ed25519(PublicKey(compressed): cd9f6f974fa16ce260172c4691e52574d3d8f6d39e6c5fd0178d1f4e24d98a) }

2024-11-19T21:21:58.590000128Z INFO  client               | PeerId: 12D3KooWPf2kcaV4eysSRq3oJRU1L8EYdRhGuDbvo4NeYbXRVmtm

2024-11-19T21:21:58.590000128Z INFO  client               | Client configured as a light node

2024-11-19T21:22:04.064000000Z INFO  libp2p_swarm         | local_peer_id=12D3KooWPf2kcaV4eysSRq3oJRU1L8EYdRhGuDbvo4NeYbXRVmtm

2024-11-19T21:22:04.124999936Z INFO  lib                  | Web client initialized

2024-11-19T21:22:04.126000128Z INFO  lib                  | Spawning consensus

2024-11-19T21:22:04.521999872Z INFO  swarm                | Peer joined peer_id=12D3KooWEn8XccNwf4XpXN1idzLFc8QjocSjFVA3nrueqpRKDn8k peer_address=/dns4/quasar.seed.nimiq.com/tcp/443/wss

2024-11-19T21:22:04.535000064Z INFO  sync                 | Requesting zkp from peer peer_id=12D3KooWEn8XccNwf4XpXN1idzLFc8QjocSjFVA3nrueqpRKDn8k

2024-11-19T21:22:20.892000000Z INFO  swarm                | Connection closed with peer connection_id=1 peer_id=12D3KooWEn8XccNwf4XpXN1idzLFc8QjocSjFVA3nrueqpRKDn8k endpoint=Dialer { address: /dns4/quasar.seed.nimiq.com/tcp/443/wss, role_override: Dialer, port_use: Reuse } connections=0

2024-11-19T21:22:20.892000000Z ERROR swarm                | Failed to send request to peer request_id=1 peer_id=12D3KooWEn8XccNwf4XpXN1idzLFc8QjocSjFVA3nrueqpRKDn8k error=Connection was closed before a response was received

2024-11-19T21:22:21.086000128Z INFO  swarm                | Peer joined peer_id=12D3KooWA7MYg6RpbCcZTReWBwcDQxWVXkxt9sBtJiCia3rzv6Za peer_address=/dns4/nimiq.surf/tcp/8443/wss/p2p/12D3KooWA7MYg6RpbCcZTReWBwcDQxWVXkxt9sBtJiCia3rzv6Za

2024-11-19T21:22:21.095000064Z INFO  sync                 | Requesting zkp from peer peer_id=12D3KooWA7MYg6RpbCcZTReWBwcDQxWVXkxt9sBtJiCia3rzv6Za

2024-11-19T21:22:21.209999872Z INFO  swarm                | Peer joined peer_id=12D3KooWNGvsdHNQHejuqAsSsb3A5Ev1AvMp5DJB7Vqn3bzw62mS peer_address=/dns4/seed-3.pos.nimiq.com/tcp/443/wss/p2p/12D3KooWNGvsdHNQHejuqAsSsb3A5Ev1AvMp5DJB7Vqn3bzw62mS

2024-11-19T21:22:21.211000064Z INFO  sync                 | Requesting zkp from peer peer_id=12D3KooWNGvsdHNQHejuqAsSsb3A5Ev1AvMp5DJB7Vqn3bzw62mS

Error: called `Result::unwrap_throw()` on an `Err` value
    at module.exports.__wbindgen_throw (/home/nimiq/Nimiq-Sunset-Cyberspace-Backend/node_modules/@nimiq/core/nodejs/worker-wasm/index.js:2423:11)
    at wasm://wasm/0193b6e2:wasm-function[5239]:0x353c01
    at wasm://wasm/0193b6e2:wasm-function[2043]:0x2fe26a
    at wasm://wasm/0193b6e2:wasm-function[160]:0xb81cb
    at wasm://wasm/0193b6e2:wasm-function[5003]:0x3523fe
    at wasm://wasm/0193b6e2:wasm-function[2866]:0x324586
    at wasm://wasm/0193b6e2:wasm-function[2252]:0x3095f0
    at wasm://wasm/0193b6e2:wasm-function[4783]:0x35075c
    at __wbg_adapter_58 (/home/nimiq/Nimiq-Sunset-Cyberspace-Backend/node_modules/@nimiq/core/nodejs/worker-wasm/index.js:248:10)
    at real (/home/nimiq/Nimiq-Sunset-Cyberspace-Backend/node_modules/@nimiq/core/nodejs/worker-wasm/index.js:229:20)
[2024-11-19T21:22:27.431Z] Error: called `Result::unwrap_throw()` on an `Err` value
    at module.exports.__wbindgen_throw (/home/nimiq/Nimiq-Sunset-Cyberspace-Backend/node_modules/@nimiq/core/nodejs/worker-wasm/index.js:2423:11)
    at wasm://wasm/0193b6e2:wasm-function[5239]:0x353c01
    at wasm://wasm/0193b6e2:wasm-function[2043]:0x2fe26a
    at wasm://wasm/0193b6e2:wasm-function[160]:0xb81cb
    at wasm://wasm/0193b6e2:wasm-function[5003]:0x3523fe
    at wasm://wasm/0193b6e2:wasm-function[2866]:0x324586
    at wasm://wasm/0193b6e2:wasm-function[2252]:0x3095f0
    at wasm://wasm/0193b6e2:wasm-function[4783]:0x35075c
    at __wbg_adapter_58 (/home/nimiq/Nimiq-Sunset-Cyberspace-Backend/node_modules/@nimiq/core/nodejs/worker-wasm/index.js:248:10)
    at real (/home/nimiq/Nimiq-Sunset-Cyberspace-Backend/node_modules/@nimiq/core/nodejs/worker-wasm/index.js:229:20)
2024-11-19T21:22:27.423000064Z ERROR panic                | thread '<unnamed>' panicked at 'already borrowed: BorrowMutError': /home/runner/.cargo/registry/src/index.crates.io-6f17d22bba15001f/wasm-bindgen-futures-0.4.45/src/task/singlethread.rs:102  
@Albermonte Albermonte changed the title Light Web Node panic Light Web Node panic after restart Nov 19, 2024
@sisou
Copy link
Member

sisou commented Nov 20, 2024

Duplicate of #2875, although in NodeJS this is an "Uncaught exception" and exits the process if not handled in userland.

@fiaxh
Copy link
Contributor

fiaxh commented Nov 22, 2024

Strack traces for both panics, and one additional one that I see alongside it:

index_bg.wasm:0x1adfe72 2024-11-22T02:52:12.673999872Z ERROR panic                | thread '<unnamed>' panicked at 'already borrowed: BorrowMutError': 

Image

Uncaught Error: called `Result::unwrap_throw()` on an `Err` value (/home/user/.cargo/registry/src/index.crates.io-6f17d22bba15001f/gloo-timers-0.2.6/src/future.rs:104:60): Canceled
    at imports.wbg.__wbindgen_throw (index.js:2190:15)

Image

Uncaught RuntimeError: unreachable

Image

@hrxi
Copy link
Contributor

hrxi commented Nov 22, 2024

If I understand it correctly, Rust panics in wasm do not trigger stack unwinding, so destructors aren't called. This means one shouldn't call into a wasm blob after it has panicked. The already borrowed: BorrowMutError sounds like one that could result from calling into a wasm blob that has already panicked.

Thus, I think, it makes sense to focus on the other panics. Unfortunately, the unwrap_throw panic that appears in the first post deliberately strips more information from the panic when compiled without debug information. @fiaxh's stack trace shows that the panic happens in gloo-timers-0.2.6/src/future.rs:104:60, https://docs.rs/crate/gloo-timers/0.2.6/source/src/future.rs#104. Earlier in the file, we can find a suspicious comment:

            // if the receiver was dropped we do nothing.
            tx.send(()).unwrap_throw();

@sisou
Copy link
Member

sisou commented Nov 22, 2024

That code is from line 72, though. Line 104 is this code:

103    fn poll(mut self: Pin<&mut Self>, cx: &mut Context) -> Poll<Self::Output> {
104        Future::poll(Pin::new(&mut self.rx), cx).map(|t| t.unwrap_throw())
105    }

https://github.com/rustwasm/gloo/blob/d600d3ac934b1c62b7009c22eabe2f91be4bd974/crates/timers/src/future.rs#L103-L105

@hrxi
Copy link
Contributor

hrxi commented Nov 22, 2024

Yes, it's the same code, but without the comment.

Digging deeper, it was introduced by rustwasm/gloo#98 (comment) (similar comment on the one with the suspicious code comment: rustwasm/gloo#98 (comment)).

We're in the TimeoutFuture::poll function. The returned error is a Canceled, which indicates that the connected sender has been dropped. The connected sender is stored in self._inner.closure.data and is only dropped together with the TimeoutFuture itself, so there should be no code path leading to this panic.

I'm again confused and not sure where to continue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants