add proposal for external vpn data format #274

ainghazal · 2023-03-29T05:31:52Z

Checklist

I have read the contribution guidelines
reference issue for this pull request:
related ooni/probe-cli pull request:
If I changed a spec, I also bumped its version number and/or date

Description

This proposal contains a data format that would enable receiving data from external sources, with semantics compatible with what we will be obtaining via the vpn experiments.

I'm unsure whether the right approach would instead be to declare a new experiment directly. But I suspect the experiment itself should just extend the data format and deal with the mechanics of receiving, validating (and probably verifying) submissions, either directly or via aggregation from trusted sources. Keeping the two separate probably makes it easier to change the submission mechanisms themselves.

Do note that an analogous discussion needs to happen for the openvpn & wireguard experiments. The semantics are probably clearer there, since we can probably refactor to a generalized vpn data format (of which openvpn and wireguard are just special cases), making the experiment more naturally tied to provider-specific logic and implementations.

fortuna · 2023-05-05T23:31:36Z

I've been playing with Outline connectivity tests at
https://github.com/Jigsaw-Code/outline-internal-sdk/tree/main/x/outline-connectivity

The test consists of resolving a domain name with a DNS resolver over a proxy, using Shadowsocks as the transport.
DNS resolution is nice because we can use the same test for TCP and UDP. Much simpler than fetching a page. This test works for any transport that I can use to connect to a TCP or UDP endpoint.
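
As a rough illustration of why one DNS probe can serve both protocols, the sketch below builds a single query and prepares it for each transport. It is not the code from the linked repository: the function names are made up for this example, and it only constructs the payloads (per RFC 1035) without sending them through a proxy.

```python
import struct


def build_dns_query(name: str, query_id: int = 0x1234, qtype: int = 1) -> bytes:
    """Build a minimal DNS query (RFC 1035) for `name`; qtype 1 is an A record."""
    # Header: ID, flags (only "recursion desired" set), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length; a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE followed by QCLASS=1 (IN).
    return header + qname + struct.pack("!HH", qtype, 1)


def frame_for_tcp(message: bytes) -> bytes:
    """DNS over TCP prefixes each message with a two-byte big-endian length."""
    return struct.pack("!H", len(message)) + message


query = build_dns_query("example.com")
udp_payload = query                 # sent as-is in one datagram
tcp_payload = frame_for_tcp(query)  # same message, length-prefixed
```

The only difference between the two payloads is the length prefix, so the same request and the same response check can be reused over any transport that exposes a TCP or UDP endpoint.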

I get reports like:

{
  "time":"2023-05-05T23:11:25Z",
  "duration_ms":24,
  "proxy":"[IP]:443",
  "resolver":"8.8.8.8:53",
  "proto":"tcp",
  "prefix":"HTTP/1.1 ",
  "error":{"op":"dial","msg":"connection refused"}
}

I believe that encapsulates all the info I need.
It has to be sanitized though. For the proxy we should keep the country and AS instead of the server IP.

We don't really need the resolver. It only tells me whether the server supports IPv4 and IPv6 destinations.

For the errors, the operation that failed and the error message seem to be enough. The operation is "dial", "write" or "read". Ideally we would canonicalize the error somehow, but that's protocol- and platform-dependent.

Prefix is an Outline-specific feature.
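
The sanitization described above could look roughly like the sketch below. Everything here is an assumption made for illustration: the output field names `proxy_country` and `proxy_asn`, the `lookup` callback (a stand-in for whatever IP-to-country/AS database is used), and the sample address and AS number, which come from ranges reserved for documentation.

```python
from typing import Callable, Tuple


def sanitize_report(report: dict, lookup: Callable[[str], Tuple[str, int]]) -> dict:
    """Return a copy of `report` with the proxy IP replaced by country and AS."""
    # Split "host:port" on the last colon; brackets around IPv6 hosts are dropped.
    host, _, _port = report["proxy"].rpartition(":")
    country, asn = lookup(host.strip("[]"))
    # Drop the server address, and the resolver since it is not really needed.
    clean = {k: v for k, v in report.items() if k not in ("proxy", "resolver")}
    clean["proxy_country"] = country  # hypothetical field name
    clean["proxy_asn"] = asn          # hypothetical field name
    return clean


def fake_lookup(ip: str) -> Tuple[str, int]:
    """Stand-in for a real IP-to-country/AS database; returns made-up values."""
    return ("ZZ", 64500)


raw = {
    "time": "2023-05-05T23:11:25Z",
    "duration_ms": 24,
    "proxy": "192.0.2.10:443",
    "resolver": "8.8.8.8:53",
    "proto": "tcp",
    "prefix": "HTTP/1.1 ",
    "error": {"op": "dial", "msg": "connection refused"},
}
clean = sanitize_report(raw, fake_lookup)
```

Passing the lookup in as a parameter keeps the sketch independent of any particular geolocation database.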

fortuna · 2023-05-15T13:21:54Z

Perhaps a better option is to break down the errors calling out each action explicitly:

{
  "error": {
    "connect": "ok|timeout|connection refused|host unreachable|network unreachable",
    "send": "ok|timeout|...",
    "receive": "ok|timeout|..."
  }
}

Perhaps it makes sense to stick to the POSIX errors when possible. In that case we could use the names of the errors:

"connect": "ECONNREFUSED"

For UDP, the errors we see on TCP connect may show up on receive instead.

If an action was not done, the error should be null or the entry missing, to differentiate absence of test from successes.
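
A minimal sketch of that convention, assuming a Python client: each attempted action maps to "ok" or a POSIX error name, and actions that were never attempted are reported as null. The helper names, the "unknown" fallback, and the choice to report socket timeouts as ETIMEDOUT are assumptions of this example, not part of the proposal.

```python
import errno
import json
import socket
from typing import Optional

ACTIONS = ("connect", "send", "receive")


def error_name(exc: Optional[BaseException]) -> str:
    """Map the outcome of an attempted action to "ok" or a POSIX error name."""
    if exc is None:
        return "ok"
    # Socket timeouts often carry no errno; report them as ETIMEDOUT (assumption).
    if isinstance(exc, (TimeoutError, socket.timeout)):
        return "ETIMEDOUT"
    if isinstance(exc, OSError) and exc.errno in errno.errorcode:
        return errno.errorcode[exc.errno]
    return "unknown"  # hypothetical fallback for non-POSIX failures


def build_error_record(outcomes: dict) -> dict:
    """`outcomes` holds only attempted actions; the others are reported as None."""
    return {a: error_name(outcomes[a]) if a in outcomes else None for a in ACTIONS}


# A TCP test whose connect was refused: send and receive were never attempted.
refused = OSError(errno.ECONNREFUSED, "connection refused")
record = build_error_record({"connect": refused})
encoded = json.dumps({"error": record})
```

Here `record["connect"]` is "ECONNREFUSED", while `send` and `receive` serialize to JSON null, which keeps "not tested" distinct from "ok".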

amircybersec · 2024-04-02T00:55:04Z

@ainghazal & @fortuna Reviving this discussion.

@ainghazal Can you please share some thoughts on what metrics or specific errors your openvpn & wireguard experiment would capture?

I wrote a document to discuss some high-level requirements of designing an end-to-end network error logging system, which includes some discussions on error format.

I believe the decision on the data format should be left to the client, but a higher-level meta format can be defined with fields such as report type and version, which are read first during consumption, such as the example provided here.
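
One way to picture such a meta format is an envelope whose type and version are read first and used to pick a parser for the client-defined body. The sketch below is only an illustration: the field names `report_type`, `version` and `body`, the registry, and the sample report type are assumptions, not an agreed format.

```python
import json

# Hypothetical registry mapping (report_type, version) to a body parser.
PARSERS = {}


def register(report_type: str, version: int):
    """Decorator that registers a parser for one report type and version."""
    def decorator(fn):
        PARSERS[(report_type, version)] = fn
        return fn
    return decorator


@register("vpn-network-error", 1)
def parse_vpn_network_error_v1(body: dict) -> dict:
    # The body layout is left to the client; this parser keeps two example fields.
    return {"proto": body["proto"], "error": body.get("error")}


def consume(raw: str) -> dict:
    """Read the envelope fields first, then dispatch on type and version."""
    envelope = json.loads(raw)
    key = (envelope["report_type"], envelope["version"])
    parser = PARSERS.get(key)
    if parser is None:
        raise ValueError(f"unsupported report type/version: {key}")
    return parser(envelope["body"])


sample = json.dumps({
    "report_type": "vpn-network-error",
    "version": 1,
    "body": {"proto": "tcp", "error": {"connect": "ECONNREFUSED"}},
})
parsed = consume(sample)
```

With this shape, a consumer can reject or route unknown versions without needing to understand each client's body format.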

amircybersec · 2024-04-02T03:40:20Z

@ainghazal I just saw the other PR #293. I am going to review the information there to get a better understanding of the openvpn format.

ainghazal · 2024-04-03T14:16:08Z

> @ainghazal I just saw the other PR #293. I am going to review the information there to get a better understanding of the openvpn format.

While it can be interesting to gather some of the semantics for the internal OpenVPN experiment, #293 is perhaps too tied to gathering network traces that help diagnose the blocking of OpenVPN. The spec proposal in this PR is conceived to be usable by almost any tunneling protocol, and to be injected externally (i.e., perhaps bypassing other abstractions assumed to be in use by the official OONI probes).

ainghazal · 2024-04-03T14:22:45Z

> I believe the decision on the data format should be left to the client

my concern is that, in order to process data and store it in the database, we'd need to agree on a common structure, or at least to be able to version and track data submitted by a few clients we're interested in understanding.

> but a higher-level meta format can be defined to have fields such as report type and version which are read first during consumption such as the example provided here.

my original proposal did suggest a "vpn-network-error" report type, mostly to differentiate it from http-like reports that we might also receive if clients are also able to send reports about web APIs being blocked. But perhaps we can focus only on the former for now:

https://github.com/ooni/spec/blob/2222f6fa5ad902c0b570f29d562a289ec9493c57/data-formats/df-010-vpnext.md#vpn-network-errors

add proposal for external data format

7e978e2

ainghazal requested review from bassosimone and hellais as code owners March 29, 2023 05:31

ainghazal changed the title ~~add proposal for external data format~~ add proposal for external vpn data format Mar 29, 2023

must

2222f6f
