Add certbot-dns-rfc2136 integration testing#8448
Conversation
fixes py2 crash
|
I will group your three first questions since they concern globally the same topic. About 1-2-3. Your general design is already pretty much set up in my opinion, and could even go without any major modification as such if parallelization is not required. And basically if there only one test that consume the new fixture, or we explicitly run tests on one unique worker, then parallelization is effectively not required. Just to say that without going further, if you want to integrate things progressively, we could just define a specific tox target, and set the parallelization off for that target. Then we can have one or several tests in the module, with a TODO linking to an adhoc issue to handle the parallelization problem. Now to handle things properly in a parallelized environment. In general in my opinion it is better to have integration assets set up on demand for each test, like nginx. However Boulder and Pebble are set up globally, mostly because Boulder is damn slow to start and consume a lot of resources, and I put Pebble on the same approach to not put too much complexity in the code. Thing is in this case, since the DNS server needs to be instructed to Boulder/Pebble, then the DNS server also needs to be set up globally. I think we can have a way with multiple test workers. The key point in my opinion is to dynamically configure bind9. Given that the DNS server would be global, it needs to be done in the
Then the Bind9 instance needs to fullfill the role of the standard DNS mock server, and the functionalities required to run the At that point the Bind9 will cover the DNS mock server functionality, and we have dedicated NB: To give more context about |
certbot-ci/certbot_integration_tests/assets/bind-config/conf/named.conf
Outdated
Show resolved
Hide resolved
|
Thanks to both of you for working on this! At a high level, I personally would suggest simplicity and iteration here. We can get something basic working and add more complexity/features later as needed. I don't think we really need things like parallelism and the ability to test with Boulder in the first version and I think it's likely we won't ever need them. As for what to test, I have the same idea and would personally just be satisfied with a simple test that successfully obtains a certificate. We could add some extra checks if you like, but again, I'd keep it pretty simple to start with. |
|
Yes for the iterative approach I would recommend what I described in the two first paragraphs of my big comment. The key points are to isolate these tests at runtime from the other integration tests (with a dedicated tox target) since it would break them if a bind9 server is used as the dns resolver for pebble without proper configuration, and to set explicitly no parallelization. |
- conftest: make DNS server a global resource - conftest: add dns_xdist parameter into node config - conftest: add --dns-server=bind flag - conftest: if configured, point the ACME server to the DNS server - dnsserver: make it sort-of compatible with xdist (future-proofing) - context: parameterize dns-rfc2136 credentials file (future proofing) - context: reduce dns-rfc2136 propagation time to speed up tests - tox: add a integration-dns-rfc2136 target - rfc2136: add a test/zone for subdelegation - rfc2136: skip tests if no DNS server is configured
|
Thanks as always for the very deep explanations. I've tried to balance simplicity and future proofing. The DNS Server is now a global resource for the integration tests, but only used for a single tox target I have further proposed in 3146ea8 a method to mock recursive DNS that I believe should be able to eventually meet the rather significant needs of #7244. That said, I admit that I did not understand @adferrand's idea to try integrate |
|
@alexzorin I will start the review soon, but from a quick look it looks great. I think the balance is good, and the PR stays in an acceptable size (at least for me, concerning things about To precise my idea on My idea was to build on the idea to never allow the DNS server to leak requests on the outside world. But instead of just making a configuration that would make Bind9 respond Or course I am pretty sure that Bind9 could be configured to do this mocking by itself, but having Of course this is entirely optional. We could just keep independant runs of |
adferrand
left a comment
There was a problem hiding this comment.
Thanks a lot for this really elegant PR. It fits very well in the general certbot-ci approach.
certbot-ci/certbot_integration_tests/assets/bind-config/rfc2136-credentials-default.ini
Show resolved
Hide resolved
- dns_server: rename rfc2136 creds file to .tpl - dns_server: dont vary dns server port, instead we will vary zone names (certbot#8455) - dns_server: log error if bind9 fails to stop cleanly - dns_server: replace assert with raise - context: remove redundant _worker_id - context: remove redundant cleanup override - context: fix seek/flush in credentials context manager - context: rename skip_if_no_server -> ...bind_server - context: add newline EOF
|
I've thought about the Having one standard stack of services for tests would be nice for I have a minor concern about making sure that there is good hygiene: making sure that mock RRs from Is there an obvious benefit to implementing that now? I can try it out if so. Otherwise, I'm pretty happy with how things have turned out. |
|
No, no immediate benefit, and this can be addressed in a later time like for parallelization. About one unified stack, I would like so, but bind9 is still an external dependency that needs to be available in the system to make these tests work, and it may be a problem for Windows. |
|
@alexzorin I tried the integration tests on a DigitalOcean Ubuntu 20.04 machine and WSL2 under Windows. Both failed with the following stack: Do you know what is going on? |
|
Ignoring the simple 9733524 mistake, I wonder if it's because Docker is taking a long time to download the image. I will try reproduce on DigitalOcean as well. |
|
Yes, I suspect that. I retried two times with the same result, but given that pytest will clear everything, it is very likely that the docker process is stopped before the big layer that is blocking here is downloaded, so it will be downloaded again. |
|
While you have the droplet up, could you time this: |
| print("BIND9 did not stop cleanly: {}".format(e), file=sys.stderr) | ||
| pass | ||
| finally: | ||
| shutil.rmtree(self.bind_root) |
There was a problem hiding this comment.
This call should also be protected by try/except to avoid to polute the logs when the temporary workspace has not even been setup before the main exception in the process occurs.
certbot-ci/certbot_integration_tests/rfc2136_tests/test_main.py
Outdated
Show resolved
Hide resolved
|
|
Ah, that's with the image is already downloaded. Perhaps after I am trying the same thing in NYC1 region at the moment. |
|
Well, with prune: But without prune, the image was already there so the docker should run almost instantly on the second |
|
Hmmm. It worked for me on the first go on DigitalOcean NYC1, Ubuntu 20.04. Installed Are you doing anything different? |
|
I put aside WSL 2, since I know that Docker for Windows has some problems to work correctly in that context (one of the main reason why I designed For DigitalOcean, I have almost the same setup, but installed Docker with snap, and my droplet is in Amsterdam. I retry with the non-snap installation. |
I confirm it is working with a non-snap installation of docker. |
|
OK. If I toggle which I guess makes sense because Docker is not a classic snap? Well, I don't know how Docker works on snap, but it is not surprising that the volume mounts are not working as expected anyhow. Would you like this to work for the PR to land? The fix is potentially simple according to https://github.com/docker-snap/docker-snap#usage:
At the moment |
|
Ah yes! Several points in favor of using a temporary workspace in the home user:
You do not even need to copy the data. Everything is refered relatively to If it is that simple, I think it is worth a try. |
|
Yuck, we would have to end up sniffing for snapped Docker anyway because |
|
Yes, I agree. |
|
Also the |
|
Thanks a lot for tackling this! |
* Implements support for ECDSA keys. Fixes certbot#2163. Thanks to @pahrohfit and @Tomoyuki-GH for previous efforts to implement suport for this. Co-Authored-By: Robert Dailey <rob@wargam.es> Co-Authored-By: Tomoyuki-GH <55397638+Tomoyuki-GH@users.noreply.github.com> * Handle unexpected key type migration. (certbot#8435) Fixes certbot#8365 This PR adds a control when `certbot certonly` or `certbot run` are called for a certificate that already exists and would eventually be replaced. As described in certbot#8365, this control is here to ensure that the user will not modify the key type of their certificate (eg. ECDSA to RSA) without an explicit approval (set explicitly `--cert-name` and `--key-type`), since RSA is the default if not specified. * Handle unexpected key type migration. * Update certbot-ci/certbot_integration_tests/certbot_tests/test_main.py Co-authored-by: Brad Warren <bmw@users.noreply.github.com> * Add certbot renew --key-type test (certbot#8447) * Test certbot renew --key-type * Fix typo * Use better asserts. Added notes to style guide. (certbot#8451) * Add --dns-server option in run_acme_server (certbot#7722) Fixes certbot#7717 This PR adds a `--dns-server` option to the `run_acme_server` test tool, in order to provide an arbitrary DNS server to Pebble or Boulder for the integration tests. I also take this occasion to make `run_acme_server` a real CLI tool using argparse, and set the `--server-type` (default `pebble`) option as well. * Set --dns-server flag in run_acme_server * Default to pebble * Add documentation * Configure also Boulder * cli: improve Obtaining/Renewing wording (certbot#8395) * cli: improve Obtaining/Renewing wording * dont use logger, and use new phrasing * .display_util.notify: dont wrap As this function is supposed to be an analogue for print, we do not want it to wrap by default. * Add certbot-dns-rfc2136 integration testing (certbot#8448) * tests: add certbot-dns-rfc2136 integration tests * dont use 'with' form of socket.socket fixes py2 crash * address some feedback: - conftest: make DNS server a global resource - conftest: add dns_xdist parameter into node config - conftest: add --dns-server=bind flag - conftest: if configured, point the ACME server to the DNS server - dnsserver: make it sort-of compatible with xdist (future-proofing) - context: parameterize dns-rfc2136 credentials file (future proofing) - context: reduce dns-rfc2136 propagation time to speed up tests - tox: add a integration-dns-rfc2136 target - rfc2136: add a test/zone for subdelegation - rfc2136: skip tests if no DNS server is configured * try add integration-dns-rfc2136 to CI * mock recursive dns via RPZ * update --dns-server args and tox.ini args * address more feedback: - dns_server: rename rfc2136 creds file to .tpl - dns_server: dont vary dns server port, instead we will vary zone names (certbot#8455) - dns_server: log error if bind9 fails to stop cleanly - dns_server: replace assert with raise - context: remove redundant _worker_id - context: remove redundant cleanup override - context: fix seek/flush in credentials context manager - context: rename skip_if_no_server -> ...bind_server - context: add newline EOF * conftest: document _setup_primary_node sideeffects * ci: rfc2136-integration from standard->nightly * fix _stop_bind (function was renamed to stop) * ignore errors from shutil.rmtree during cleanup * dns_server: check for crash while polling * remove --dry-run from rfc2136 test * import print_function * certbot-ci: fix py2 crash in dns_server * Read files as binary in crypto_util for crypto.load_certificate. (certbot#8371) * Flesh out ECDSA documentation (certbot#8464) * Changelog tweaks. * Add ECDSA documentation * Fix typo * Add Python 3.9 support and tests (certbot#8460) Fixes certbot#8134. * Test on Python 3.9. * Mention Python 3.9 support in changelog. * s/\( *'Pro.*3\.\)8\(',\)/\18\2\n\19\2/ * undo changes to tox.ini * Move more tests to Python 3.9 * Update PyYAML and packages which pinned it back * Upgrade typed-ast * Use <= to "pin" dnspython * Fix lint by telling pylint it cannot be trusted * Disable mypy on RFC plugin * add comment about <= support * Fix link typo in README (certbot#8476) * nginx: fix Unicode crash on Python 2 (certbot#8480) * nginx: fix py2 unicode sandwich The nginx parser would crash when saving configuraitons containing Unicode, because py2's `str` type does not support Unicode. This change fixes that crash by ensuring that a string type supporting Unicode is used in both Python 2 and Python 3. * nginx: add unicode to the integration test config * update CHANGELOG * Update changelog for 1.10.0 release * Release 1.10.0 * Add contents to certbot/CHANGELOG.md for next version * Bump version to 1.11.0 * Fix changelog typo (certbot#8488) * fix changelog typo * remove empty entry * Deprecate certbot-auto and remove tests * Completely deprecate certbot-auto * DeaDeactivate centos6/oraclelinux6 tests * Remove tests assets * Remove another test * Revert "Remove tests assets" This reverts commit e603afe. * Undo certbot-auto changes and remove centos6 tests * Don't deprecate certbot-auto quite yet * Remove centos6 test farm tests * undo changes to test farm test scripts * Deprecate certbot-auto and remove tests * Completely deprecate certbot-auto * DeaDeactivate centos6/oraclelinux6 tests * Remove tests assets * Remove another test * Revert "Remove tests assets" This reverts commit e603afe. (cherry picked from commit ff3a07d) * Undo certbot-auto changes and remove centos6 tests * Don't deprecate certbot-auto quite yet * Remove centos6 test farm tests * undo changes to test farm test scripts (cherry picked from commit e5113d5) * Fix add deprecated argument (certbot#8500) Fixes certbot#8495. To further explain the problem here, `modify_kwargs_for_default_detection` as called in `add` is simplistic and doesn't always work. See certbot#6164 for one other example. In this case, were bitten by the code https://github.com/certbot/certbot/blob/d1e7404358c05734aaf436ef3c9d709029d62b09/certbot/certbot/_internal/cli/helpful.py#L393-L395 The action used for deprecated arguments isn't in `ZERO_ARG_ACTIONS` so it assumes that all deprecated flags take one parameter. Rather than trying to fix this function (which I think can only realistically be fixed by certbot#4493), I took the approach that was previously used in `HelpfulArgumentParser.add_deprecated_argument` of bypassing this extra logic entirely. I adapted that function to now call `HelpfulArgumentParser.add` as well for consistency and to make testing easier. * Rename deprecated arg action class * Skip extra parsing for deprecated arguments * Add back test of --manual-public-ip-logging-ok * Add changelog entry * Fix changelog typo (certbot#8497) Co-authored-by: Adrien Ferrand <ferrand.ad@gmail.com> * Fix add deprecated argument (certbot#8500) (certbot#8501) Fixes certbot#8495. To further explain the problem here, `modify_kwargs_for_default_detection` as called in `add` is simplistic and doesn't always work. See certbot#6164 for one other example. In this case, were bitten by the code https://github.com/certbot/certbot/blob/d1e7404358c05734aaf436ef3c9d709029d62b09/certbot/certbot/_internal/cli/helpful.py#L393-L395 The action used for deprecated arguments isn't in `ZERO_ARG_ACTIONS` so it assumes that all deprecated flags take one parameter. Rather than trying to fix this function (which I think can only realistically be fixed by certbot#4493), I took the approach that was previously used in `HelpfulArgumentParser.add_deprecated_argument` of bypassing this extra logic entirely. I adapted that function to now call `HelpfulArgumentParser.add` as well for consistency and to make testing easier. * Rename deprecated arg action class * Skip extra parsing for deprecated arguments * Add back test of --manual-public-ip-logging-ok * Add changelog entry (cherry picked from commit 5f73274) * Update changelog for 1.10.1 release * Release 1.10.1 * Add contents to certbot/CHANGELOG.md for next version * Bump version to 1.11.0 * cli: clean up `certbot renew` summary (certbot#8503) * cli: clean up `certbot renew` summary - Unduplicate output which was being sent to both stdout and stderr - Don't use IDisplay.notification to buffer output - Remove big "DRY RUN" guards above and below, instead change language to "renewal" or "simulated renewal" - Reword "Attempting to renew cert ... produced an unexpected error" to be more concise. * add newline to docstring Co-authored-by: ohemorange <ebportnoy@gmail.com> Co-authored-by: ohemorange <ebportnoy@gmail.com> * Update both main VA and remote VA to use the provided DNS server (certbot#8467) * dns-google: improve credentials error message (certbot#8482) This adds a 'Error parsing credentials file ...' wrapper to any errors raised inside certbot-dns-google's usage of oauth2client, to make it obvious to the user where the problem lies. * Removed some unused imports. (certbot#8424) These were not annotated as something that should be ignored, and the test-suite passes with these changes. * snap: disable the "user site-packages directory" (certbot#8509) Although Certbot is a classic snap, it shouldn't load Python code from the host system. This change prevents packages being loaded from the "user site-packages directory" (PEP-370). i.e. Certbot will no longer load DNS plugins installed via `pip install --user certbot-dns-*`. * add coverage testing to dns-rfc2136 integration (certbot#8469) * add coverage testing to dns-rfc2136 integration * add coverage rule for certbot/* as well * Completely deprecate certbot-auto (certbot#8489) Fixes certbot#8296 * Completely deprecate certbot-auto * Add changelog * Deprecate support for Python 2 (certbot#8491) Fixes certbot#8388 * Deprecate support for Python 2 * Ignore deprecation warning * Update certbot/CHANGELOG.md Co-authored-by: Brad Warren <bmw@users.noreply.github.com> * Add reminders to update documentation (certbot#8518) * Add documentation PR checklist item. * Update contributing doc * Avoid --system-site-packages during the snap build by preparing a venv with pipstrap that already includes wheel (certbot#8445) This PR proposes an alternative configuration for the snap build that avoid the need to use `--system-site-package` when constructing the virtual environment in the snap. The rationale of `--system-site-package` was that by default, snapcraft creates a virtual environment without `wheel` installed in it. However we need it to build the wheels like `cryptography` on ARM architectures. Sadly there is not way to instruct snapcraft to install some build dependencies in the virtual environment before it kicks in the build phase itself, without overriding that entire phase (which is possible with `parts.override-build`). The alternative proposed here is to not override the entire build part, but just add some preparatory steps that will be done before the main actions handled by the `python` snap plugin. To do so, I take advantage of the `--upgrade` flag available for the `venv` module in Python 3. This allows to reuse a preexisting virtual environment, and upgrade its component. Adding a flag to the `venv` call is possible in snapcraft, thanks to the `SNAPCRAFT_PYTHON_VENV_ARGS` environment variable (and it is already used to set the `--system-site-package`). Given `SNAPCRAFT_PYTHON_VENV_ARGS` set to `--upgrade` , we configure the build phase as follows: * create the virtual environment ourselves in the expected place (`SNAPCRAFT_PART_INSTALL`) * leverage `tools/pipstrap.py` to install `setuptools`, `pip`, and of course, `wheel` * let the standard build operations kick in with a call to `snapcraftctl build`: at that point the `--upgrade` flag will be appended to the standard virtual environment creation, reusing our crafted venv instead of creating a new one. This approach has also the advantage to invoke `pipstrap.py` as it is done for the other deployable artifacts, and for the PR validations, reducing risks of shifts between the various deployment methods. * Deprecate support of Apache 2.2 in certbot-apache (certbot#8516) Fixes certbot#8462 * Deprecate support of Apache 2.2 in certbot-apache * Add a changelog * Add finish_release flags and CLI parsing (certbot#8522) * Setup a timeout to the remote snap build process (certbot#8484) This PR adds a `--timeout` flag to `tools/snap/build_remote.py` in order to fail the process if the time execution reaches the provided timeout. It is set to 5h30 on the relevant Azure job, while the job itself has a timeout of 6h managed on Azure side. This allows a slightly better output for these jobs when the snapcraft build stales for any reason. * add OS package warning (certbot#8533) * Make our test farm tests instances self-destruct (certbot#8536) * remove unused user data * have instance self-destruct in case cleanup fails * correct kwargs * fix param order * remove CentOS 6 cruft from test farm tests (certbot#8534) * Add path to certbot executable in debug log (certbot#8538) * Enable again build isolation with proper pinning of build dependencies (certbot#8443) Fixes certbot#8256 First let's sum up the problem to solve. We disabled the build isolation available in pip>=19 because it could potential break certbot build without a control on our side. Basically builds are not reproductible. Indeed the build isolation triggers build of PEP-517 enabled transitive dependencies (like `cryptography`) with the build dependencies defined in their `pyproject.toml`. For `cryptography` in particular these requirements include `setuptools>=40.6.0`, and quite logically pip will install the latest version of `setuptools` for the build. And when `setuptools` broke with the version 50, our build did the same. But disabling the build isolation is not a long term solution, as more and more project will migrate on this approach and it basically provides a lot of benefit in how dependencies are built. The ideal solution would be to be able to apply version constraints on our side on the build dependencies, in order to pin `setuptools` for instance, and decide precisely when we upgrade to a newer version. However for now pip does not provide a mechanism for that (like a `--build-constraint` flag or propagation of existing `--constraint` flag). Until I saw pypa/pip#9081 and pypa/pip#8439. Apart the fact that pypa/pip#9081 shows that pip maintainers are working on this issue, it explains how pip works regarding PEP-517 and infers which workaround can be used to still pin the build dependencies. It turns out that pip invokes itself in each build isolation to install the build dependencies. It means that even if some flags (like `--constraint`) are not explicitly passed to the pip sub call, the global environment remains, in particular the environment variables. Thus it is known that every pip flag can alternatively be set by environment variable using the following pattern for the variable name: `PIP_[FLAG_NAME_UPPERCASE]`. So for `--constraint`, it is `PIP_CONSTRAINT`. And so you can pass a constraint file to the pip sub call through that mechanism. I made some tests with a constraint file containing pinning for `setuptools`: indeed under isolation zone, the constraint file has been honored and the provided pinned version has been used to build the dependencies (I tested it with `cryptography`). Finally this PR takes advantage of this mechanism, by setting `PIP_CONSTRAINT` to `pip_install`, the snap building process, the Dockerfiles and the windows installer building process. I also extracted out the requirements of the new `pipstrap.py` to be reusable in these various build processes. * Use workaround to fix build requirements in build isolation, and renable build isolation * Clean imports in pipstrap * Externalize pipstrap reqs to be reusable * Inject pipstrap constraints during pip_install * Update docker build * Update snapcraft build * Prepare installer build * Fix pipstrap constraints in snap build * Add back --no-build-cache option in Docker images build * Update snap/snapcraft.yaml * Use proper flags with pip Co-authored-by: Brad Warren <bmw@users.noreply.github.com> * Added certbot-ci to lint section. Silenced and fixed linting warnings. (certbot#8450) * remove reference to letsencrypt(-auto) (certbot#8531) * Clean up certbot-auto docs (certbot#8532) Fixes certbot#8519. I left the `certbot-auto` docs in `install.rst` to avoid breaking links and to help propagate information about our changes there. I moved it closer to the bottom of the doc though since I think our documentation about OS packages and Docker is more helpful to most people. * clean up certbot-auto docs * add more info to changelog * remove more certbot-auto references Co-authored-by: Mads Jensen <mje@inducks.org> Co-authored-by: Robert Dailey <rob@wargam.es> Co-authored-by: Tomoyuki-GH <55397638+Tomoyuki-GH@users.noreply.github.com> Co-authored-by: Adrien Ferrand <adferrand@users.noreply.github.com> Co-authored-by: Brad Warren <bmw@users.noreply.github.com> Co-authored-by: alexzorin <alex@zorin.id.au> Co-authored-by: Brad Warren <bmw@eff.org> Co-authored-by: ohemorange <ebportnoy@gmail.com> Co-authored-by: Adrien Ferrand <ferrand.ad@gmail.com> Co-authored-by: alexzorin <alex@zor.io> Co-authored-by: osirisinferi <github@flut.demon.nl>
Fixes #8433.
This adds integration testing for the
certbot-dns-rfc2136plugin by providing an integration test context which runs an authoritative BIND server, preconfigured with zones and RFC2136 credentials.Do not merge until #8448 (comment) is addressed (moving test from PR to nightly)(now done)Everything below here is outdated:
It is very much a draft design, but on top of #7722 (+ adding
--dns-servertoconftest.py) it successfully issues a certificate using RFC2136 with e.g.:Some questions of my own:
conftest.pyand keep it running?xdiststuff. Is the current design broken if parallelism kicks in? Do I need to offset the network port by the worker? Is that what nginx does?{0}.{1}.wtfdomain generation pattern would fit in. ISTM the zones need to exist before anything starts up. If we were to do dynamic zones and automatically generate the BIND config, I'm not sure at what point in the program this would happen.--dns-serverfor Boulder might be a bit tricky. I left some comments innamed.confabout that. But maybe it's not worth testing multiple ACME servers.