Implement leak checker in daemon #7344

Open
wants to merge 9 commits into
base: main
from

hulthe
Contributor

@hulthe hulthe commented Dec 13, 2024


This change is Reviewable

linear bot commented Dec 13, 2024

@hulthe hulthe requested a review from dlon December 13, 2024 13:52
Member

@dlon dlon left a comment

Reviewed 8 of 24 files at r1.
Reviewable status: 8 of 24 files reviewed, 3 unresolved discussions (waiting on @hulthe)


mullvad-daemon/src/leak_checker/mod.rs line 140 at r2 (raw file):

            // Make sure the tunnel state didn't change while we were doing the leak test.
            // If that happened, then our results might be invalid.
            while let Ok(event) = self.events_rx.try_recv() {

Instead of this dance, I wonder if it would be possible to just tokio::spawn the leak check and in Task::run use select!:

select! {
    Some(TaskEvent::NewTunnelState(s)) = next_event => {
        pending_leak_test.abort();
        let TunnelStateTransition::Connected(tunnel) = &s else {
            continue;
        };
        pending_leak_test = tokio::spawn(start_test()).fuse();
    },

    // ...

    result = &mut pending_leak_test => { /* ... */ },
}

mullvad-daemon/Cargo.toml line 19 at r2 (raw file):

[dependencies]
anyhow = "*" # TODO: do we want this?
surge-ping = "0.8.0" # TODO: workspace dep?

Unused?


leak-checker/Cargo.toml line 20 at r2 (raw file):

futures.workspace = true
serde = { workspace = true, features = ["derive"] }
reqwest = { version = "0.12.9", default-features = false, features = ["json", "rustls-tls"] }

Should we gate am_i_mullvad behind a feature so we don't have to depend on reqwest? Do we plan on using that?


Cargo.toml line 87 at r2 (raw file):

    "talpid-wireguard",
    "tunnel-obfuscation",
    "wireguard-go-rs",

Should this include leak-checker?

Contributor Author

@hulthe hulthe left a comment

Reviewable status: 8 of 24 files reviewed, 3 unresolved discussions (waiting on @dlon)


mullvad-daemon/src/leak_checker/mod.rs line 140 at r2 (raw file):

Previously, dlon (David Lönnhager) wrote…

Instead of this dance, I wonder if it would be possible to just tokio::spawn the leak check and in Task::run use select!:

select! {
    Some(TaskEvent::NewTunnelState(s)) = next_event => {
        pending_leak_test.abort();
        let TunnelStateTransition::Connected(tunnel) = &s else {
            continue;
        };
        pending_leak_test = tokio::spawn(start_test()).fuse();
    },

    // ...

    result = &mut pending_leak_test => { /* ... */ },
}

Rewrote it using select. Not sure if it's less dancey, but it's more correct at least :)

Spawning a task wasn't necessary though, selecting on the future directly works just fine, and it's implicitly aborted when we drop it.
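
To illustrate the point about implicit cancellation, here is a minimal std-only sketch (not the daemon's code). It polls a never-finishing future once by hand and then drops it, showing that the future's local state is cleaned up and the code after the `.await` never runs. `NoopWaker`, `DropGuard` and `drop_pending_test` are hypothetical names introduced for this example, and `std::future::pending` stands in for the real leak test.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; enough to poll a future by hand.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Sets its flag when dropped, so cleanup can be observed from outside.
struct DropGuard(Rc<Cell<bool>>);

impl Drop for DropGuard {
    fn drop(&mut self) {
        self.0.set(true);
    }
}

/// Polls a never-finishing stand-in "leak test" once, then drops it.
/// Returns `(cleaned_up, completed)`.
fn drop_pending_test() -> (bool, bool) {
    let cleaned_up = Rc::new(Cell::new(false));
    let completed = Rc::new(Cell::new(false));

    let guard = DropGuard(cleaned_up.clone());
    let done = completed.clone();
    let mut test: Pin<Box<dyn Future<Output = ()>>> = Box::pin(async move {
        let _guard = guard; // state owned by the future
        std::future::pending::<()>().await; // stand-in for network I/O
        done.set(true); // only runs if the future is polled to completion
    });

    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    assert!(matches!(test.as_mut().poll(&mut cx), Poll::Pending));

    // Dropping the future is the cancellation: no explicit abort is needed.
    drop(test);
    (cleaned_up.get(), completed.get())
}

fn main() {
    let (cleaned_up, completed) = drop_pending_test();
    println!("cleaned_up={cleaned_up}, completed={completed}");
}
```

A spawned task behaves differently: it keeps running after its join handle is dropped, which is why the spawn-based variant needs an explicit `abort()`.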

Contributor Author

@hulthe hulthe left a comment

Reviewable status: 8 of 24 files reviewed, 2 unresolved discussions (waiting on @dlon)


leak-checker/Cargo.toml line 20 at r2 (raw file):

Previously, dlon (David Lönnhager) wrote…

Should we gate am_i_mullvad behind a feature so we don't have to depend on reqwest? Do we plan on using that?

I'm leaning towards removing that module completely. And maybe also moving the leak-checker lib into the daemon.

Member

@dlon dlon left a comment

Reviewed 4 of 13 files at r6, 6 of 16 files at r7, all commit messages.
Reviewable status: 15 of 26 files reviewed, 5 unresolved discussions (waiting on @hulthe)


mullvad-daemon/src/leak_checker/mod.rs line 140 at r2 (raw file):

Previously, hulthe (Joakim Hulthe) wrote…

Rewrote it using select. Not sure if it's less dancey, but it's more correct at least :)

Spawning a task wasn't necessary though, selecting on the future directly works just fine, and it's implicitly aborted when we drop it.

I was hoping that we could get away with handling self.events_rx.recv() in only one place, and in that place cancel any existing check whenever there was a state transition. 😞 Then you also wouldn't have needed this loop.

This is fine, though!


leak-checker/src/traceroute/platform/common.rs line 78 at r7 (raw file):

        let packet = &read_buf[..n];
        let result = parse_ipv4(packet)
            .map_err(|e| anyhow!("Ignoring packet: (len={n}, ip.src={source}) {e} ({packet:02x?})"))

We should probably not log any packet contents when this is ready.


mullvad-daemon/src/leak_checker/mod.rs line 195 at r7 (raw file):

    // get_default_route in route manager
    #[cfg(target_os = "macos")]
    let interface = todo!("get default interface");

This is just route_manager.get_default_routes() now


mullvad-daemon/src/leak_checker/mod.rs line 212 at r7 (raw file):

        let Ok(Some(route)) = talpid_routing::get_best_default_route(family) else {
            todo!("no best default route");

It probably makes sense to just fail and log here?
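
A minimal sketch of the "fail and log" variant, assuming the surrounding function can return an error. `Route`, `LeakCheckError`, `best_default_route` and `pick_interface` are hypothetical stand-ins, not the `talpid_routing` API, and `eprintln!` takes the place of the daemon's logger.

```rust
/// Hypothetical stand-in for a route entry.
#[derive(Debug, Clone, PartialEq)]
struct Route {
    interface: String,
}

/// Hypothetical error type for the leak check.
#[derive(Debug, PartialEq)]
enum LeakCheckError {
    NoDefaultRoute,
}

/// Hypothetical lookup with the same shape as the call under review:
/// it can fail, or succeed without finding any route.
fn best_default_route(routes: &[Route]) -> Result<Option<Route>, String> {
    Ok(routes.first().cloned())
}

/// Returns the interface to test on, or logs and fails instead of panicking.
fn pick_interface(routes: &[Route]) -> Result<String, LeakCheckError> {
    let Ok(Some(route)) = best_default_route(routes) else {
        // Stand-in for a log call; the check is skipped, not fatal.
        eprintln!("Skipping leak check: no default route found");
        return Err(LeakCheckError::NoDefaultRoute);
    };
    Ok(route.interface)
}

fn main() {
    let routes = vec![Route { interface: "en0".to_string() }];
    println!("{:?}", pick_interface(&routes));
    println!("{:?}", pick_interface(&[]));
}
```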


talpid-core/src/firewall/macos.rs line 432 at r7 (raw file):

                // TODO: do we need this?
                //rules.push(self.get_block_relay_rule(peer_endpoint)?);

Only if it currently leaks outside the tunnel. If it is able to reach the relay inside the tunnel, we don't want/need to block it.

Member

@dlon dlon left a comment

Reviewed 1 of 16 files at r7, 1 of 15 files at r8.
Reviewable status: 14 of 33 files reviewed, 6 unresolved discussions (waiting on @hulthe)


windows/winfw/src/winfw/fwcontext.cpp line 319 at r8 (raw file):

	));

	ruleset.emplace_back(std::make_unique<baseline::PermitIcmpTtl>(relayClient));

We should revert this now.

@hulthe hulthe force-pushed the detect-leaks-and-inform-user-des-1332 branch from 03fa5eb to 087d0f2 Compare December 20, 2024 17:30
Contributor Author

@hulthe hulthe left a comment

Reviewable status: 14 of 33 files reviewed, 5 unresolved discussions (waiting on @dlon)


windows/winfw/src/winfw/fwcontext.cpp line 319 at r8 (raw file):

Previously, dlon (David Lönnhager) wrote…

We should revert this now.

done

@hulthe hulthe force-pushed the detect-leaks-and-inform-user-des-1332 branch 2 times, most recently from dae74db to ff91326 Compare January 10, 2025 13:43
@hulthe hulthe requested a review from dlon January 13, 2025 11:51
Contributor Author

@hulthe hulthe left a comment

Reviewable status: 5 of 43 files reviewed, 6 unresolved discussions (waiting on @dlon)


talpid-core/src/firewall/macos.rs line 432 at r7 (raw file):

Previously, dlon (David Lönnhager) wrote…

Only if it currently leaks outside the tunnel. If it is able to reach the relay inside the tunnel, we don't want/need to block it.

Done. Removed it since we're relying on ICMP now.


talpid-core/src/firewall/macos.rs line 302 at r15 (raw file):

            peer_endpoint,
            tunnel: Some(tunnel),
            ..

@dlon do you remember why I did this?

Contributor Author

@hulthe hulthe left a comment

Reviewable status: 5 of 43 files reviewed, 5 unresolved discussions (waiting on @dlon)


leak-checker/src/traceroute/platform/common.rs line 78 at r7 (raw file):

Previously, dlon (David Lönnhager) wrote…

We should probably not log any packet contents when this is ready.

Done.

@hulthe hulthe force-pushed the detect-leaks-and-inform-user-des-1332 branch from fae0900 to 85dd641 Compare January 22, 2025 09:24
@hulthe hulthe force-pushed the detect-leaks-and-inform-user-des-1332 branch from 85dd641 to e721a81 Compare January 22, 2025 09:26
@hulthe hulthe marked this pull request as ready for review January 22, 2025 09:26
Member

@dlon dlon left a comment

Reviewed 1 of 17 files at r7, 7 of 10 files at r9, 13 of 26 files at r10, 15 of 22 files at r11, 4 of 13 files at r12, 16 of 16 files at r13, 6 of 7 files at r14, 1 of 18 files at r15, 8 of 9 files at r16, 8 of 10 files at r17, all commit messages.
Reviewable status: 32 of 43 files reviewed, 3 unresolved discussions (waiting on @hulthe)


talpid-core/src/firewall/macos.rs line 302 at r15 (raw file):

Previously, hulthe (Joakim Hulthe) wrote…

@dlon do you remember why I did this?

Nope. The leak checker only runs while connected, right? So maybe this should be reverted.

But you could test whether it breaks anything by setting NAT_WORKAROUND to true.


talpid-core/src/firewall/macos.rs line 332 at r17 (raw file):

        let no_nat_to_vpn_server = pfctl::NatRuleBuilder::default()
            .action(pfctl::NatRuleAction::NoNat)
            .to(peer_endpoint.endpoint.address.ip())

Is this relevant? I assume the difference is that the rule only applies if the port is set, which does seem correct (assuming the port isn't just ignored). Is it needed for the checker to work when NAT_WORKAROUND is true, though?


leak-checker/Cargo.toml line 30 at r10 (raw file):

[target.'cfg(windows)'.dependencies]
windows-sys.workspace = true

I think this is missing some features. Maybe it does not matter because talpid-windows adds these?

Member

@dlon dlon left a comment

Awesome! This should probably also have a changelog entry. :)

Reviewed 1 of 26 files at r10, 4 of 13 files at r12, 1 of 9 files at r16.
Reviewable status: 38 of 43 files reviewed, 4 unresolved discussions (waiting on @hulthe)


mullvad-daemon/src/lib.rs line 1035 at r17 (raw file):

            ExcludedPathsEvent(update, tx) => self.handle_new_excluded_paths(update, tx).await,
            LeakDetected(leak_info) => {
                log::warn!("LEAK DETECTED! AAAH: {leak_info:?}");

Could this be reworded a bit? 😅


leak-checker/src/traceroute/unix/android.rs line 17 at r17 (raw file):

        ip_version: Ip,
    ) -> anyhow::Result<()> {
        // can't use the same method as desktop-linux here beacuse reasons

😄 I guess the reason is that SO_BINDTODEVICE requires root access or some capability?


mullvad-daemon/src/leak_checker/mod.rs line 36 at r17 (raw file):

pub trait LeakCheckerCallback: Send + 'static {
    fn on_leak(&mut self, info: LeakInfo) -> CallbackResult;

Should CallbackResult be removed? Seems like it's never actually used.


leak-checker/src/traceroute/windows.rs line 16 at r17 (raw file):

/// Implementation of traceroute using `ping.exe`
///
/// This monstrosity exists because the Windows firewall is not helpful enough to allow us to

👍


leak-checker/src/traceroute/windows.rs line 47 at r17 (raw file):

                .args(["-n", "1"]) // number of pings
                .args(["-w", &SEND_TIMEOUT.as_millis().to_string()])
                .args(["-S", &interface_ip.to_string()]) // bind to interface IP

Have you been able to test if this works for IPv6?


leak-checker/src/traceroute/windows.rs line 74 at r17 (raw file):

                .with_context(output_err)?;

            let ip: IpAddr = ip.parse().unwrap();

Would it be overly defensive to not unwrap here? 😅 Seems likely that it will always be a valid address
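
A minimal sketch of the non-panicking variant, assuming the address arrives as text extracted from command output. `parse_reply_ip` is a hypothetical helper, and a `String` error stands in for the `anyhow` context used in the code under review.

```rust
use std::net::IpAddr;

/// Hypothetical helper: parse address text taken from command output,
/// returning an error instead of panicking on unexpected input.
fn parse_reply_ip(text: &str) -> Result<IpAddr, String> {
    text.trim()
        .parse::<IpAddr>()
        .map_err(|e| format!("failed to parse IP address {text:?}: {e}"))
}

fn main() {
    println!("{:?}", parse_reply_ip("192.0.2.1"));
    println!("{:?}", parse_reply_ip("not-an-ip"));
}
```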

Member

@dlon dlon left a comment

Reviewed 1 of 10 files at r17.
Reviewable status: 39 of 43 files reviewed, 4 unresolved discussions (waiting on @hulthe)


mullvad-daemon/Cargo.toml line 18 at r17 (raw file):

[dependencies]
anyhow = { workspace = true }

I don't mind introducing anyhow here at all, but I wonder if others may object? Especially since we use it only sparingly.

Also, arguably the library part of leak-checker should not rely on it, only its CLI

Member

@dlon dlon left a comment

Reviewable status: 29 of 43 files reviewed, 5 unresolved discussions (waiting on @hulthe)


leak-checker/src/traceroute/unix/linux.rs line 46 at r17 (raw file):

        let raw_socket = socket.into_raw_fd();
        let std_socket = unsafe { std::net::UdpSocket::from_raw_fd(raw_socket) };

I believe you can use From here, without unsafe: std::net::UdpSocket::from(socket)

https://docs.rs/socket2/latest/socket2/struct.Socket.html#impl-From%3CSocket%3E-for-UdpSocket
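
The same ownership-transferring idea can be sketched with the standard library alone, as a Unix-only analogue that does not use `socket2`: a socket is turned into an `OwnedFd` and back through `From`/`Into`, with no `unsafe` and no raw descriptor in between. `roundtrip_without_unsafe` is a hypothetical name for this example.

```rust
use std::net::UdpSocket;
use std::os::fd::OwnedFd;

/// Moves a UDP socket into an owned file descriptor and back using only
/// safe conversions. Returns true if the socket is still bound to the
/// same local address afterwards.
fn roundtrip_without_unsafe() -> std::io::Result<bool> {
    let socket = UdpSocket::bind("127.0.0.1:0")?;
    let addr = socket.local_addr()?;

    let fd: OwnedFd = socket.into(); // ownership of the descriptor moves out
    let socket = UdpSocket::from(fd); // and back in, without `from_raw_fd`

    Ok(socket.local_addr()? == addr)
}

fn main() -> std::io::Result<()> {
    println!("same socket: {}", roundtrip_without_unsafe()?);
    Ok(())
}
```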

Contributor Author

@hulthe hulthe left a comment

Reviewable status: 20 of 43 files reviewed, 4 unresolved discussions (waiting on @dlon)


leak-checker/Cargo.toml line 30 at r10 (raw file):

Previously, dlon (David Lönnhager) wrote…

I think this is missing some features. Maybe it does not matter because talpid-windows adds these?

Good catch. I checked and we're only depending on windows_sys::Win32::NetworkManagement::Ndis::NET_LUID_LH from windows-sys and that item doesn't seem to be feature-gated, so I think we're good.


leak-checker/src/traceroute/unix/android.rs line 17 at r17 (raw file):

Previously, dlon (David Lönnhager) wrote…

😄 I guess the reason is that SO_BINDTODEVICE requires root access or some capability?

Some such reason, probably. I rewrote the comment to be slightly less vague


mullvad-daemon/Cargo.toml line 18 at r17 (raw file):

Previously, dlon (David Lönnhager) wrote…

I don't mind introducing anyhow here at all, but I wonder if others may object? Especially since we use it only sparingly.

Also, arguably the library part of leak-checker should not rely on it, only its CLI

Since we are the consumers of this library, and since we don't currently care about what the errors are, I think it's unnecessary boilerplate.


mullvad-daemon/src/lib.rs line 1035 at r17 (raw file):

Previously, dlon (David Lönnhager) wrote…

Could this be reworded a bit? 😅

hehe, i don't see the problem 😆


mullvad-daemon/src/leak_checker/mod.rs line 36 at r17 (raw file):

Previously, dlon (David Lönnhager) wrote…

Should CallbackResult be removed? Seems like it's never actually used.

Very nice catch. I fixed it so that callbacks are cleaned up as intended
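
One way such cleanup could work, sketched under assumptions: the trait signature is taken from the snippet quoted in this thread, but the `CallbackResult` variants, the `LeakInfo` field, and the `Limited`, `notify` and `remaining_after` names are hypothetical. Each callback's return value decides whether it stays registered.

```rust
/// Hypothetical payload; the real `LeakInfo` fields are not shown in this thread.
#[derive(Debug, Clone, PartialEq)]
struct LeakInfo {
    interface: String,
}

/// Assumed variants: keep the callback, or unregister it.
#[derive(Debug, PartialEq)]
enum CallbackResult {
    Ok,
    Drop,
}

trait LeakCheckerCallback: Send + 'static {
    fn on_leak(&mut self, info: LeakInfo) -> CallbackResult;
}

/// Example callback that asks to be removed after `remaining` notifications.
struct Limited {
    remaining: u32,
}

impl LeakCheckerCallback for Limited {
    fn on_leak(&mut self, _info: LeakInfo) -> CallbackResult {
        self.remaining = self.remaining.saturating_sub(1);
        if self.remaining == 0 {
            CallbackResult::Drop
        } else {
            CallbackResult::Ok
        }
    }
}

/// Notifies every callback and keeps only those that returned `Ok`.
fn notify(callbacks: &mut Vec<Box<dyn LeakCheckerCallback>>, info: &LeakInfo) {
    callbacks.retain_mut(|cb| cb.on_leak(info.clone()) == CallbackResult::Ok);
}

/// Registers two callbacks, reports `rounds` leaks, returns how many remain.
fn remaining_after(rounds: u32) -> usize {
    let mut callbacks: Vec<Box<dyn LeakCheckerCallback>> = vec![
        Box::new(Limited { remaining: 1 }),
        Box::new(Limited { remaining: 3 }),
    ];
    let info = LeakInfo { interface: "eth0".to_string() };
    for _ in 0..rounds {
        notify(&mut callbacks, &info);
    }
    callbacks.len()
}

fn main() {
    for rounds in 0..4 {
        println!("after {rounds} leaks: {} callbacks", remaining_after(rounds));
    }
}
```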


talpid-core/src/firewall/macos.rs line 332 at r17 (raw file):

Previously, dlon (David Lönnhager) wrote…

Is this relevant? I assume the difference is that the rule only applies if the port is set, which does seem correct (assuming the port isn't just ignored). Is it needed for the checker to work when NAT_WORKAROUND is true, though?

This change was needed in a previous version of the leak checker, but it's not anymore. I just kept it because I think this is more correct


leak-checker/src/traceroute/windows.rs line 47 at r17 (raw file):

Previously, dlon (David Lönnhager) wrote…

Have you been able to test if this works for IPv6?

Nope. I'll see if I can get a VM working with IPv6


leak-checker/src/traceroute/windows.rs line 74 at r17 (raw file):

Previously, dlon (David Lönnhager) wrote…

Would it be overly defensive to not unwrap here? 😅 Seems likely that it will always be a valid address

Nice catch. Fixed.

Contributor Author

@hulthe hulthe left a comment

Reviewable status: 20 of 43 files reviewed, 3 unresolved discussions (waiting on @dlon)


leak-checker/src/traceroute/unix/linux.rs line 46 at r17 (raw file):

Previously, dlon (David Lönnhager) wrote…

I believe you can use From here, without unsafe: std::net::UdpSocket::from(socket)

https://docs.rs/socket2/latest/socket2/struct.Socket.html#impl-From%3CSocket%3E-for-UdpSocket

Nice
