Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New feature request: Don't throw exception on renewal failure, and continue the rest of the renewals #363

Open
Marcel0024 opened this issue Mar 26, 2020 · 15 comments

Comments

@Marcel0024
Copy link

Marcel0024 commented Mar 26, 2020

I have an web application (SaaS), that has a multiple bindings from different domains.

Now the problem is, sometimes clients change their DNS or don't longer use our service without telling us. So on certifcate renewal it will fail even if it's renewing like 6 at a time. If one fails it fails all of them.

Proposal 1:
'letsencryp:ThrowOnRenewalFailure' = default is true (as it is now)
This is used only by the Web job

if 'letsencryp:ThrowOnRenewalFailure' = false, behavior in Web Jobs:
IF a renewal fails, just skip it. Let's Encrypt will send us a reminder email anyway, than we will know it failed

Proposal 2:
Never throw on renewal failure, always continue

@Marcel0024 Marcel0024 changed the title New feature request: Don't throw on renewal failure, and continue in the foreach New feature request: Don't throw exception on renewal failure, and continue the rest of the renewals Apr 2, 2020
@Marcel0024
Copy link
Author

@sjkp Can i get feedback on this?

@ohadschn
Copy link
Contributor

ohadschn commented Apr 9, 2020

@Marcel0024 perhaps Simon's vNext project does this: https://github.com/sjkp/letsencrypt-azure

Also, you could check out the WebJob I wrote based on Simon's great work, which can do this with some extra config: (basically you'd define a cert per domain). A little bit of a hassle, but it can be achieved very easily by modifying the configuration script.

@Marcel0024
Copy link
Author

Marcel0024 commented Apr 10, 2020

@ohadschn the first project you mention, won't work for me, because it uses Azure DNS. I don't have control of any of the domains i serve. Our external clients point their domain to my app. Than i request a certificate for their domain.

Grouping them into a SAN certificate is not ideal either, my problem is that some clients randomly change their DNS, after deciding not to use our services. Than requesting a SAN certificate would also fail.

This extension works fine, my only problem is that when 1 domain fails during renewal, the rest won't renew either.
So it should just skip if 1 renewal fails and not all of them.

I submitted a PR but i would like to get feedback from @sjkp , to see if i need to change anything or what's the ideal solution/behavior

@ohadschn
Copy link
Contributor

ohadschn commented Apr 10, 2020

@Marcel0024 didn't realize you submitted a PR, more power to you! I guess you can just build from source to unblock yourself...

But just to clarify, using the WebJob I built, I suggested the exact opposite of grouping the host names into a SAN (which is what this extension, and my WebJob by default, do anyway).

Say you have 5 host names. In my WebJob you could define each of them to have its own separate cert. Then if say 2 of them fail, the other 3 will be unaffected.

@Marcel0024
Copy link
Author

@ohadschn Ah i see, i'll take a look

@InteXX
Copy link

InteXX commented Jul 20, 2020

@ohadschn

"I guess you can just build from source to unblock yourself"

Could you elaborate a bit on this? I've filed this issue, but it seems we haven't heard from @sjkp in a little bit. He's likely tied up on other important projects, so I don't want to press him.

I'm prepared to git sync the repo, make the necessary changes and build, but I'm at a loss as to how to deploy from there.

Also: will deploying my own build 'break the chain,' so to speak, for updates to future versions? I imagine so, but I'd like to be certain.

I just received my 20-day notice on these certs, so there's some urgency with this one.

@ohadschn
Copy link
Contributor

@InteXX I'm not sure what you're asking - do you want to resolve the issue you linked to (NullReferenceException), or do you want to resolve this issue (continue the rest of the renewals on failure)?

At any rate, I never built a Web App extension so I don't know to tell you off the top of my head how you could replicate this extension's behavior from you own custom build. There's probably a way to publish you custom extension, or sideload it, need to read the docs. Another option is just installing the WebJob and setting its configuration manually. As for future versions, yes if @sjkp releases a new version you would have to override your own custom one, but on the flip side that new version might include the fix (by merging the PR above submitted by @Marcel0024).

Beyond that, both issues might be resolved/worked around by using the WebJob I built on top of this extension (can't be sure since I don't know the underlying cause), so you might want to try that if you don't get further support on the existing issues: https://github.com/ohadschn/letsencrypt-webapp-renewer

@InteXX
Copy link

InteXX commented Jul 21, 2020

@ohadschn

I like the idea of continuing the rest of the renewals on failure. It's an important enhancement and I hope it makes the cut.

My immediate concern, though, is the AcmeRequestException I'm getting (the second of the two listed there). Once I get past that I'll be able to renew these certs and continue with less pressing affairs.

The thrown exception—accountDoesNotExist: No account exists with the provided key—may actually be legit. I've used the connection string for the storage account as provided in the Azure portal, so it should work. I don't know.

I think my first move should be to clone the repo and have a closer look.

Failing any success there, I think my next move should be to have a look at your solution. Thanks for the link.

@InteXX
Copy link

InteXX commented Jul 21, 2020

@ohadschn

I presume your solution uses the new LetsEncrypt v2?

@InteXX
Copy link

InteXX commented Jul 21, 2020

@ohadschn

I just had a thought... I wonder if the problem is that my DNS is hosted at GoDaddy.

@ohadschn
Copy link
Contributor

ohadschn commented Jul 21, 2020

@InteXX

  1. My solution will retry failures per certificate (so if you want the behavior specified in this issue you'd define each host to have its own separate cert).
  2. AcmeRequestException should have nothing to do with Azure storage account connection strings. It's the ACME account, only identified by your e-mail and/or your hostname if memory serves. At any rate, my solution doesn't use Azure Storage.
  3. Yes my solution uses LetsEncrypt v2 (since it's based on the last version of this extension which uses the same).
  4. Assuming you're using the latest version of this extension, it doesn't support the ACME DNS challenge at all, which means your DNS provider should be irrelevant.

@InteXX
Copy link

InteXX commented Jul 21, 2020

@ohadschn

Perfect. All of it.

I'm going to try yours first, before diving into the code.

@InteXX
Copy link

InteXX commented Jul 21, 2020

@ohadschn

I'm seeing a potential problem. I glanced through the source and nothing jumped out at me either way.

I see here in your documentation that you require the web app (App Service) name to be prepended to the actual value and delimited with a hyphen, i.e. letsencrypt:%webAppName%-subscriptionId.

My chosen naming convention for my app services also includes a hyphen, e.g. the service bar in the resource group foo will be named foo-bar.

Does your parsing code consider this? Will that extra hyphen in my app service name cause it to fail?

@ohadschn
Copy link
Contributor

ohadschn commented Jul 21, 2020

@InteXX shouldn't be an issue, because it works the other way around - first I'm noting down all available web apps from the letsencrypt:webApps config, then for each such web app (say foo-bar) I'm looking for an option that looks like letsencrypt:foo-bar-subscriptionId. You can see the code here.

If you have any further questions about the WebJob I built on top of this (great) extension, please open an issue on the letsencrypt-webapp-renewer repository and we can discuss there - we hijacked this issue sufficiently :)

@InteXX
Copy link

InteXX commented Jul 21, 2020

Very well. Fair 'nuff :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants