Handle retries using urllib3 Retry #44
base: master
Conversation
Breaking changes:
- `default_retry_after` has been replaced by `retry_backoff_factor`
- Extended the default `retry_status_codes` with more codes
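For callers, the migration would look roughly like this; a minimal sketch assuming Sickle's constructor keeps accepting these keyword arguments on this branch (endpoint URL and values are made up):

```python
from sickle import Sickle

# Before this PR (hypothetical values): a fixed fallback wait between retries.
# sickle = Sickle('https://example.org/oai', max_retries=3,
#                 default_retry_after=30)

# After this PR: an exponential backoff factor instead. The two arguments are
# not equivalent, which is why this is a breaking change.
sickle = Sickle('https://example.org/oai', max_retries=3,
                retry_backoff_factor=2)
```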
It stops at BACKOFF_MAX = 120, so we can let it increase a bit faster.
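For context, urllib3's sleep time grows exponentially with the backoff factor and is capped at BACKOFF_MAX; a rough sketch of the arithmetic (not the library's actual code, and the handling of the very first retry varies between urllib3 versions):

```python
# Rough sketch of urllib3's exponential backoff, capped at BACKOFF_MAX.
BACKOFF_MAX = 120  # urllib3 caps each sleep at 120 seconds

def backoff_time(backoff_factor, consecutive_errors):
    # sleep = factor * 2**(n-1), capped at BACKOFF_MAX
    return min(BACKOFF_MAX, backoff_factor * (2 ** (consecutive_errors - 1)))

# e.g. a factor of 2 gives 2, 4, 8, 16, 32, 64, 120, 120, ... seconds
print([backoff_time(2, n) for n in range(1, 9)])
```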
total=max_retries,
backoff_factor=retry_backoff_factor,
status_forcelist=retry_status_codes or [429, 500, 502, 503, 504],
method_whitelist=frozenset(['GET', 'POST'])
Note: `method_whitelist` has been deprecated in urllib3 v1.26 and removed in its v2. And since requests pins urllib3 to a range which includes the latter, IMHO the newer idiom must now be used here:
- method_whitelist=frozenset(['GET', 'POST'])
+ allowed_methods=frozenset(['GET', 'POST'])
Otherwise, the line

retry_adapter = requests.adapters.HTTPAdapter(max_retries=Retry(

will fail with

TypeError: __init__() got an unexpected keyword argument 'method_whitelist'
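If compatibility with both old and new urllib3 releases were wanted, one hedged sketch of a fallback (purely illustrative, not part of this PR):

```python
from urllib3.util.retry import Retry

# Illustrative compatibility shim: prefer the new kwarg, fall back to the old one.
_retry_kwargs = dict(
    total=5,
    backoff_factor=1,
    status_forcelist=[429, 500, 502, 503, 504],
)
try:
    retry = Retry(allowed_methods=frozenset(['GET', 'POST']), **_retry_kwargs)
except TypeError:  # urllib3 < 1.26 only knows method_whitelist
    retry = Retry(method_whitelist=frozenset(['GET', 'POST']), **_retry_kwargs)
```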
assert retries.total == 99
assert retries.backoff_factor == 1.1234
assert retries.status_forcelist == (418,)
assert retries.method_whitelist == frozenset(['POST', 'GET'])
- assert retries.method_whitelist == frozenset(['POST', 'GET'])
+ assert retries.allowed_methods == frozenset(['POST', 'GET'])
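For what it's worth, a self-contained sketch of how such assertions can be exercised against a mounted adapter (the session setup and URL are illustrative, not taken from the PR):

```python
import requests
from urllib3.util.retry import Retry

session = requests.Session()
adapter = requests.adapters.HTTPAdapter(max_retries=Retry(
    total=99,
    backoff_factor=1.1234,
    status_forcelist=(418,),
    allowed_methods=frozenset(['GET', 'POST']),
))
session.mount('https://', adapter)

# The adapter keeps the Retry instance on .max_retries
retries = session.get_adapter('https://example.org/oai').max_retries
assert retries.total == 99
assert retries.backoff_factor == 1.1234
assert retries.status_forcelist == (418,)
assert retries.allowed_methods == frozenset(['POST', 'GET'])
```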
Per #43, this is my take at replacing the custom retry code with the tried and tested urllib3 `Retry` class. Benefits include that it also handles `Timeout` and `ConnectionError`, and that the sleep time increases for each retry. Tests pass, and I'm also testing the branch in a real-world scenario now, where I need to harvest from a server which is quite unstable and happens to produce both timeouts, connection errors and 500 errors during updates. It seems to work fine so far!

A slight inconvenience of retrying upon `ConnectionError` is that it will also retry if you misspell the hostname. Personally, I think the benefits of handling `ConnectionError` outweigh that inconvenience, but it could be something to discuss. If the rest of the URL contains a typo, so that the server returns a 404, it will of course still raise an error immediately.

Perhaps the major issue here is that the `default_retry_after` argument can no longer be supported. It has been replaced by a new argument, `retry_backoff_factor`, to control the exponential backoff factor, but the replacement will be a breaking change since the old argument cannot be converted into the new one.

Also note that I extended the default `retry_status_codes` with more codes. Since this is "OAI for humans", and humans do not necessarily know which errors to expect up front, I think it makes sense to include some more common error codes by default. But I can leave this extension out of this PR if it's controversial.

Logging: If it's desirable to have logging, one possibility is to enable debug logging for the retry module:
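(A minimal sketch of that configuration, assuming the standard library `logging` module; the logger name `urllib3.util.retry` is urllib3's module-level logger and is an assumption not spelled out in this thread.)

```python
import logging

# Send log output to stderr and raise verbosity for urllib3's retry module only.
logging.basicConfig(level=logging.INFO)
logging.getLogger('urllib3.util.retry').setLevel(logging.DEBUG)
```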
to get log messages like these:
I'm not sure if this should be the responsibility of sickle or the user, though.