Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: retry on DNS queries. #1180

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
34 changes: 30 additions & 4 deletions challenge/dns01/nameserver.go
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,8 @@ import (
"sync"
"time"

"github.com/cenkalti/backoff/v4"
"github.com/go-acme/lego/v3/log"
"github.com/miekg/dns"
)

Expand Down Expand Up @@ -229,15 +231,39 @@ func dnsQuery(fqdn string, rtype uint16, nameservers []string, recursive bool) (
m := createDNSMsg(fqdn, rtype, recursive)

var in *dns.Msg
var err error
var errG error
ldez marked this conversation as resolved.
Show resolved Hide resolved

for _, ns := range nameservers {
in, err = sendDNSQuery(m, ns)
if err == nil && len(in.Answer) > 0 {
bo := backoff.NewExponentialBackOff()
bo.Multiplier = 1.2
bo.InitialInterval = dnsTimeout
bo.MaxInterval = 2 * bo.InitialInterval
bo.MaxElapsedTime = 4 * bo.InitialInterval
Comment on lines +239 to +241
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sets the max. wait time to 2 minutes if I invoke lego with --dns-timeout=30, doesn't it? Should bo.InitialInterval be dnsTimeout/4 instead?

Copy link
Member Author

@ldez ldez Jun 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The initial interval must be the same as the timeout use by the DNS client.

The dnsTimeout cannot be used as MaxElapsedTime because it can be too short with the current value.

Currently dnsTimeout is the timeout for one DNS query.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The initial interval must be the same as the timeout use by the DNS client

I see. Then the documentation of the --dns-timeout flag needs to be modified to something like "this is the minimum timeout given to resolve SOA queries; on network errors the actual timeout might be considerably longer".

Alternatively, we can reduce the default timeout down to something like 2 or 3s to approximate the previous behaviour (on that note, this should work in most cases, because DNS usually resolves fast enough). To accommodate #1008, we could also bump bo.MaxInterval and bo.MaxElapsedTime to 10*initialInterval and 20*initialInterval respectively.

Copy link
Member Author

@ldez ldez Jun 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can reduce the default timeout.

But I think that multiply by 10 or 20 is too big: the dnsQuery is used by several functions in a loop (by example a loop on ns) and this can become very very long.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I've missed the context; I thought this was part of the fetchSoaByFqdn methods. Sorry for the confusion on my part.


operation := func() error {
var err error
in, err = sendDNSQuery(m, ns)

// errors from miekg/dns package and some errors from the net package must stop the retry.
var e *dns.Error
if err != nil &&
(strings.Contains(err.Error(), "connection refused") || errors.As(err, &e)) {
return backoff.Permanent(err)
}

return err
}

notify := func(err error, d time.Duration) {
log.Infof("dnsQuery retry %v: fqdn=%s, ns=%s: %v", d, fqdn, ns, err)
}

errG = backoff.RetryNotify(operation, bo, notify)
if errG == nil && len(in.Answer) > 0 {
break
}
}
return in, err
return in, errG
}

func createDNSMsg(fqdn string, rtype uint16, recursive bool) *dns.Msg {
Expand Down