Retry errors with subproblems in obtain_certificate with --allow-subset-of-names (#9251)#9272
Conversation
|
Hey @bmw, do you have any advice RE the checks I suggested in the PR description? We should definitely make sure a retry is likely to succeed before retrying, but I don't know exactly what errors are going to be recoverable (or if I should just stick to CAA errors). Currently this rescues any ACME error, and retries with the domains minus the identifiers of any subproblems that might exist, which is almost certainly too general. |
|
Hey @JamesBalazs. Sorry for the delayed response here. I've been busy. Three high level thoughts:
It sounds like you think (2) is too aggressive though. Why is that? What types of problems are you worried about here? I certainly could be overlooking something. |
|
Hey @bmw, in response to your points:
Thanks for taking a look at this! |
|
…ow_subset_of_names Only retry if not all domains succeeded
5bda47c to
2484239
Compare
|
@bmw I've implemented the change we discussed along with a simple test for the behaviour in |
|
I'll review this soon :) ! |
alexzorin
left a comment
There was a problem hiding this comment.
I am super sorry about leaving this for so long. Here is a further review.
alexzorin
left a comment
There was a problem hiding this comment.
This is looking very close. The actual behavior seems to be exactly right now.
With the unit tests, I've provided a lot of feedback, but only because I know that trying to make sense of how they work is super daunting.
alexzorin
left a comment
There was a problem hiding this comment.
OK, just some small nits, but this looks great. Thanks for sticking with it!
alexzorin
left a comment
There was a problem hiding this comment.
LGTM, thanks. Sorry we missed the release by half a day 😆 .
This is an attempt at retrying when subproblems are present on the error raised by
obtain_certificate.According to the spec:
so here we can use
acme.Identifier.values present in the subproblems, to remove failed domains from the call toobtain_certificate, to retry.The first pass at this contains little in the way of checks to ensure the ACME error is handled correctly. I'm assuming we might want to check:
acme.Identifier.typeis anIDENTIFIER_FQDNorIDENTIFIER_IPBut I would like some feedback on what errors we expect to be worth retrying, and which checks are necessary to ensure it is safe to retry (so we're not just wasting retry attempts that may be limited by a quota in LetsEncrypt).
Pull Request Checklist
mastersection ofcertbot/CHANGELOG.mdto include a description of the change being made.AUTHORS.mdif you like.