-
-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: retry on DNS queries. #1180
base: master
Are you sure you want to change the base?
Conversation
c44e405
to
b324723
Compare
Tested today, it does improve the situation. However, I would advice to also fail the process early here if a DNS error still happen despite the retry. |
This not needed because the DNSError don't produce a retry and by example a timeout can have several reasons. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not quite sure what the expected behaviour for --dns-timeout
ought to be. If I provide --dns-timeout=30
, I'd expect the SOA query to finish within 30s (give or take a few seconds). Please correct me if I'm wrong, but I believe the query now might take up to 2 minutes.
bo.InitialInterval = dnsTimeout | ||
bo.MaxInterval = 2 * bo.InitialInterval | ||
bo.MaxElapsedTime = 4 * bo.InitialInterval |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sets the max. wait time to 2 minutes if I invoke lego with --dns-timeout=30
, doesn't it? Should bo.InitialInterval
be dnsTimeout/4
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The initial interval must be the same as the timeout use by the DNS client.
The dnsTimeout
cannot be used as MaxElapsedTime because it can be too short with the current value.
Currently dnsTimeout
is the timeout for one DNS query.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The initial interval must be the same as the timeout use by the DNS client
I see. Then the documentation of the --dns-timeout
flag needs to be modified to something like "this is the minimum timeout given to resolve SOA queries; on network errors the actual timeout might be considerably longer".
Alternatively, we can reduce the default timeout down to something like 2 or 3s to approximate the previous behaviour (on that note, this should work in most cases, because DNS usually resolves fast enough). To accommodate #1008, we could also bump bo.MaxInterval
and bo.MaxElapsedTime
to 10*initialInterval
and 20*initialInterval
respectively.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can reduce the default timeout.
But I think that multiply by 10 or 20 is too big: the dnsQuery is used by several functions in a loop (by example a loop on ns) and this can become very very long.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I've missed the context; I thought this was part of the fetchSoaByFqdn
methods. Sorry for the confusion on my part.
bo.InitialInterval = dnsTimeout | ||
bo.MaxInterval = 2 * bo.InitialInterval | ||
bo.MaxElapsedTime = 4 * bo.InitialInterval |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I've missed the context; I thought this was part of the fetchSoaByFqdn
methods. Sorry for the confusion on my part.
I am wondering about the impact and side effects of this PR. I need to take a little more time to think. |
Fixes #1008