This repository has been archived by the owner on Mar 2, 2022. It is now read-only.

Allow rotation between multiple bandwidth servers #57

Open
teor2345 opened this issue Jan 23, 2018 · 4 comments

Comments

@teor2345
Collaborator

There is one hard-coded bandwidth server.
We need multiple bandwidth servers, so that measurements aren't biased towards one location.
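
One way to picture the requested rotation is a minimal sketch like the one below. It is not taken from the bwscanner code: the function name `rotate_endpoints` and the `example.net` URLs are hypothetical, and it assumes the scanner simply needs the next server to use for each measurement. Reshuffling on every pass keeps any one server from always being paired with the same relays.

```python
import random
from typing import Iterator, Optional, Sequence

# Hypothetical endpoints for illustration only; a real deployment would list
# the bandwidth servers that operators actually run.
ENDPOINTS = [
    "https://bw1.example.net",
    "https://bw2.example.net",
    "https://bw3.example.net",
]


def rotate_endpoints(
    endpoints: Sequence[str], seed: Optional[int] = None
) -> Iterator[str]:
    """Yield endpoints forever, visiting each one once per pass.

    The order is reshuffled at the start of every pass.
    """
    if not endpoints:
        raise ValueError("at least one endpoint is required")
    rng = random.Random(seed)
    while True:
        order = list(endpoints)
        rng.shuffle(order)
        yield from order


rotation = rotate_endpoints(ENDPOINTS, seed=0)
first_pass = [next(rotation) for _ in range(len(ENDPOINTS))]
second_pass = [next(rotation) for _ in range(len(ENDPOINTS))]
```

Each pass uses every server exactly once, so over a full scan the load and any location bias are spread evenly across the configured servers.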

@aagbsn
Collaborator

aagbsn commented Jan 23, 2018

Measurement endpoints in the original bandwidth scanner design are set up per authority, so each operator also has a different test endpoint. That's the intent here, with the baked-in URL (https://bwauth.torproject.org) as a default fallback for testing, etc.
However, one proposal is to create 'loop' circuits back to the bandwidth scanner, which also runs a tor relay (with ExitPolicy Accept to the test endpoint); see the "there and back again" circuit generator and https://trac.torproject.org/projects/tor/ticket/9762. Thoughts?

@teor2345
Collaborator Author

teor2345 commented Jan 23, 2018 via email

@aagbsn
Collaborator

aagbsn commented Jan 24, 2018

The bandwidth scanning process shouldn't depend on the anonymity of the scanners or endpoints, because this is hard to guarantee. Ideally, a malicious exit relay shouldn't be able to learn the endpoints AND selectively limit throughput to influence the measurements of other relays in the same circuit. I.e. a circuit that looks like:

bwscanner client ---> measured relay ---> malicious exit ---> known measurement endpoint

means that a malicious exit relay operator could collude with guard relays and bias measurement results. I believe this is made worse by the 'slice' approach of the current scanners, because controlling enough relays in the manner above can probably be used to block new relays from being fairly measured and rising in rank, via a kind of 'consensus wall'. I also suspect this might happen as a side effect of circuits built between relays that coincidentally happen to be in the same datacenter, because those connections will scale throughput faster, biasing measurements toward circuits that are geographically closer.

And any relay that is aware it's being measured can do whatever it can to improve its measurement results (e.g. drop every other cell that isn't part of a measurement circuit). I'm not sure what we can do about this type of biasing, though. I think this is the sort of thing that Peerflow can mitigate, because its measurements are the result of passive observations from the rest of the network rather than active probes.

So a scanner process that looks like this:

bwscanner client ---> local relay ---> measured relay ---> endpoint exit ---> endpoint local to exit
(e.g. an exit with an ExitPolicy that only allows the measurement endpoint).

should mean that a malicious relay can only influence the results of circuits that it is also part of. With this approach, an active network adversary who is able to attack network infrastructure (e.g. the scanner's ISP or upstreams) can of course still degrade measurements towards relays that it wishes to degrade.

The geographical biases are also difficult to address if we continue to run a small set of scanners (i.e. one or fewer per dirauth), as most of the scanners are located in the US or Europe. Even moving the test endpoints to a CDN will still leave the scanners wherever they may be. So you'll see biases that arise from a relay being near a bwscanner and a CDN node, all of which might be in the same datacenter!

I guess it's obvious that using a single CDN hands a lot of power to the CDN operator, too.

The redesigned scanner was built with the possibility of 'sharding' the scans across a set of scanners: see circuit.TwoHop and the arguments "partitions, this_partition". The intent here was to be able to run multiple scanners in parallel on different machines in order to scale better (i.e. reduce the time to complete a scan of the entire network). For example, a bandwidth authority operator could use a cloud computing service to spin up nodes for the duration of the scan and then combine the results. That might make attacking bandwidth scanner infrastructure harder, because endpoints won't be defined statically, though exits with a single-line ExitPolicy towards a test endpoint are going to stand out.
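
The sharding idea can be sketched roughly as follows. This is an illustration only and not the actual circuit.TwoHop logic: the helper name `relays_for_partition` is hypothetical, the argument names are borrowed from the comment above, and treating `this_partition` as a zero-based index is an assumption.

```python
from typing import List, Sequence


def relays_for_partition(
    relays: Sequence[str], partitions: int, this_partition: int
) -> List[str]:
    """Return the relays that one scanner instance should measure.

    Relays are sorted first so every scanner derives the same split
    independently, then dealt out round-robin across the partitions.
    `this_partition` is zero-based here (an assumption of this sketch).
    """
    if partitions < 1:
        raise ValueError("partitions must be at least 1")
    if not 0 <= this_partition < partitions:
        raise ValueError("this_partition must be in range(partitions)")
    return sorted(relays)[this_partition::partitions]


# Ten placeholder relay names split across three scanner instances.
relays = [f"relay{i:02d}" for i in range(10)]
shards = [relays_for_partition(relays, 3, p) for p in range(3)]
```

Because the split depends only on the sorted relay list and the two arguments, separately launched scanners need no coordination beyond agreeing on the relay list, and their results can be merged afterwards since the shards do not overlap.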

Single onion services make a lot of sense because we can avoid the requirement of using an exit in the measurement path, so a circuit can look something like:

bwscanner client ---> bwscanner local relay ---> measured relay ---> bwscanner local relay single onion endpoint

There's a lot of speculation above :) - Thoughts?

P.S. There are other ways to measure performance that could be used for feedback instead of bandwidth measurements, e.g. circuit failure rates and extend latency, particularly for high-bandwidth relays that may be more CPU-constrained than bandwidth-limited.

@teor2345
Collaborator Author

teor2345 commented Jan 25, 2018 via email

2 participants