Split downloads across mirrors #27
As discussed in #22, Wikipedia has a limit of 2 concurrent connections and seems to rate-limit each to about 4 MB/s, so roughly 8 MB/s total from dumps.wikimedia.org. There are at least two mirrors of the Enterprise dumps.
For the fastest speeds, ideally we could share downloads between Wikipedia and the mirrors, or even download different parts of the same file concurrently, the way aria2c does.
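For illustration, an aria2c-style split could look like the sketch below: each mirror serves a different byte range of the same file via HTTP Range requests. This is a minimal sketch in Python with aiohttp, not anything from this repo; the function names are made up, and it assumes the file size is already known (e.g. from a HEAD request) and that every mirror supports Range requests.

```python
import asyncio

import aiohttp

async def fetch_range(session: aiohttp.ClientSession, url: str, start: int, end: int) -> bytes:
    # Ask one mirror for an inclusive byte range of the file.
    async with session.get(url, headers={"Range": f"bytes={start}-{end}"}) as resp:
        assert resp.status == 206, "mirror must support HTTP Range requests"
        return await resp.read()

async def split_download(mirror_urls: list[str], size: int, dest: str) -> None:
    # Give each mirror an equal slice of the file, then stitch the slices
    # back together in order. Slices are buffered in memory for brevity.
    chunk = -(-size // len(mirror_urls))  # ceiling division
    async with aiohttp.ClientSession() as session:
        parts = await asyncio.gather(*(
            fetch_range(session, url, i * chunk, min((i + 1) * chunk, size) - 1)
            for i, url in enumerate(mirror_urls)
        ))
    with open(dest, "wb") as f:
        for part in parts:
            f.write(part)
```

A real implementation would hand out many small chunks rather than one fixed slice per mirror, so a fast mirror isn't left waiting on a slow one.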
Unfortunately, none of the parallel downloaders I've seen allow setting connection limits per host (e.g. 2 for dumps.wikimedia.org, 4 for the rest).
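That said, a per-host cap is not much code to write ourselves. aiohttp's TCPConnector(limit_per_host=...) applies the same number to every host, but a semaphore keyed by hostname gets the asymmetric limits we want. A minimal sketch (the 2-connection cap for dumps.wikimedia.org is from this issue; the default of 4 for mirrors and all names here are assumptions):

```python
import asyncio
from urllib.parse import urlsplit

import aiohttp

# Connection caps: 2 for dumps.wikimedia.org per the Wikimedia limit above;
# 4 for everything else is an assumed, not a documented, mirror policy.
HOST_LIMITS = {"dumps.wikimedia.org": 2}
DEFAULT_LIMIT = 4
_semaphores: dict[str, asyncio.Semaphore] = {}

def host_semaphore(url: str) -> asyncio.Semaphore:
    # One shared semaphore per hostname, created lazily on first use.
    host = urlsplit(url).hostname or ""
    if host not in _semaphores:
        _semaphores[host] = asyncio.Semaphore(HOST_LIMITS.get(host, DEFAULT_LIMIT))
    return _semaphores[host]

async def fetch(session: aiohttp.ClientSession, url: str, dest: str) -> None:
    # Stream one URL to disk without exceeding its host's connection cap.
    async with host_semaphore(url):
        async with session.get(url) as resp:
            resp.raise_for_status()
            with open(dest, "wb") as f:
                async for chunk in resp.content.iter_chunked(1 << 20):
                    f.write(chunk)

async def download_all(jobs: list[tuple[str, str]]) -> None:
    # jobs: (url, destination path) pairs spread across Wikimedia and mirrors.
    async with aiohttp.ClientSession() as session:
        await asyncio.gather(*(fetch(session, url, dest) for url, dest in jobs))
```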
So besides writing our own downloader, to respect the Wikimedia limits we could:
Comments

What is the simplest solution?

The simplest is to only use a single host.