Skip to content

Add certbot-dns-rfc2136 integration testing#8448

Merged
adferrand merged 15 commits intocertbot:masterfrom
alexzorin:dns-rfc2136-integration-tests
Nov 17, 2020
Merged

Add certbot-dns-rfc2136 integration testing#8448
adferrand merged 15 commits intocertbot:masterfrom
alexzorin:dns-rfc2136-integration-tests

Conversation

@alexzorin
Copy link
Copy Markdown
Collaborator

@alexzorin alexzorin commented Nov 12, 2020

Fixes #8433.

This adds integration testing for the certbot-dns-rfc2136 plugin by providing an integration test context which runs an authoritative BIND server, preconfigured with zones and RFC2136 credentials.

Do not merge until #8448 (comment) is addressed (moving test from PR to nightly) (now done)


Everything below here is outdated:

It is very much a draft design, but on top of #7722 (+ adding --dns-server to conftest.py) it successfully issues a certificate using RFC2136 with e.g.:

pytest certbot-ci/certbot_integration_tests/rfc2136_tests --acme-server=pebble --dns-server=127.0.0.1:45953

Some questions of my own:

  • At the moment, it's a new BIND server for every test. Does it make sense to instead provide the BIND instance globally in conftest.py and keep it running?
  • I'm not up to speed on this xdist stuff. Is the current design broken if parallelism kicks in? Do I need to offset the network port by the worker? Is that what nginx does?
  • I'm not sure how the {0}.{1}.wtf domain generation pattern would fit in. ISTM the zones need to exist before anything starts up. If we were to do dynamic zones and automatically generate the BIND config, I'm not sure at what point in the program this would happen.
  • --dns-server for Boulder might be a bit tricky. I left some comments in named.conf about that. But maybe it's not worth testing multiple ACME servers.
  • What type of tests did we have in mind to begin with? e.g. Making sure Certbot works with different tsig algorithms. Making sure Certbot chooses the right zone when there's subdelegation. Making sure Certbot fully cleans up records if domain authorization fails.

@adferrand
Copy link
Copy Markdown
Collaborator

adferrand commented Nov 12, 2020

I will group your three first questions since they concern globally the same topic.

About 1-2-3.

Your general design is already pretty much set up in my opinion, and could even go without any major modification as such if parallelization is not required. And basically if there only one test that consume the new fixture, or we explicitly run tests on one unique worker, then parallelization is effectively not required.

Just to say that without going further, if you want to integrate things progressively, we could just define a specific tox target, and set the parallelization off for that target. Then we can have one or several tests in the module, with a TODO linking to an adhoc issue to handle the parallelization problem.

Now to handle things properly in a parallelized environment. In general in my opinion it is better to have integration assets set up on demand for each test, like nginx. However Boulder and Pebble are set up globally, mostly because Boulder is damn slow to start and consume a lot of resources, and I put Pebble on the same approach to not put too much complexity in the code. Thing is in this case, since the DNS server needs to be instructed to Boulder/Pebble, then the DNS server also needs to be set up globally.

I think we can have a way with multiple test workers. The key point in my opinion is to dynamically configure bind9. Given that the DNS server would be global, it needs to be done in the _setup_primary_node of conftest.py. Ideally we should design this so it colocates nicely with the other integration tests (certbot and nginx). I think it requires the following:

  • having a global flag in pytest_addoption to select as DNS server a crafted Bind9 instance or the internal mock DNS server
  • depending on the global flag, we either setup Bind9 or leave the default behavior (internal DNS server mock)
  • if Bind9 is used, configure the --dns-server to the run_acme_server function

Then the Bind9 instance needs to fullfill the role of the standard DNS mock server, and the functionalities required to run the certbot-dns-rfc2136 integration tests. I think it can be done by defining (dynamically) a zone for each {0}.wtf domain, each one having a zone configuration returning localhost for every subdomain request.

At that point the Bind9 will cover the DNS mock server functionality, and we have dedicated {0}.wtf zones to work with certbot-dns-rfc2136 integration tests.

NB: To give more context about {0}.wtf, it is to solve the problem of having multiple certbot instances in parallel, and how the CA server will redirect its challenge requests check for HTTP challenges. The solution I found is to have a L4 proxy (HTTP proxy) between the CA server and the certbot instances. Given the actual host the CA server wants to check, this proxy will redirect the request to the proper test worker, and so the proper certbot instance.

@adferrand
Copy link
Copy Markdown
Collaborator

On your last point, I agree with you in term of functional coverage to begin with. Concerning subdomain delegation, it would be great to cover #7244 and #7346, as it could also unblock these features.

@bmw
Copy link
Copy Markdown
Member

bmw commented Nov 12, 2020

Thanks to both of you for working on this!

At a high level, I personally would suggest simplicity and iteration here. We can get something basic working and add more complexity/features later as needed. I don't think we really need things like parallelism and the ability to test with Boulder in the first version and I think it's likely we won't ever need them.

As for what to test, I have the same idea and would personally just be satisfied with a simple test that successfully obtains a certificate. We could add some extra checks if you like, but again, I'd keep it pretty simple to start with.

@adferrand
Copy link
Copy Markdown
Collaborator

Yes for the iterative approach I would recommend what I described in the two first paragraphs of my big comment.

The key points are to isolate these tests at runtime from the other integration tests (with a dedicated tox target) since it would break them if a bind9 server is used as the dns resolver for pebble without proper configuration, and to set explicitly no parallelization.

@adferrand adferrand self-assigned this Nov 12, 2020
- conftest: make DNS server a global resource
- conftest: add dns_xdist parameter into node config
- conftest: add --dns-server=bind flag
- conftest: if configured, point the ACME server to the DNS server
- dnsserver: make it sort-of compatible with xdist (future-proofing)
- context: parameterize dns-rfc2136 credentials file (future proofing)
- context: reduce dns-rfc2136 propagation time to speed up tests
- tox: add a integration-dns-rfc2136 target
- rfc2136: add a test/zone for subdelegation
- rfc2136: skip tests if no DNS server is configured
@alexzorin alexzorin marked this pull request as ready for review November 13, 2020 07:36
@alexzorin
Copy link
Copy Markdown
Collaborator Author

Thanks as always for the very deep explanations. I've tried to balance simplicity and future proofing.

The DNS Server is now a global resource for the integration tests, but only used for a single tox target integration-dns-rfc2186. I've tried to give it all a shape that will allow it to be easily transitioned into full-blown xdist and {0}.{1}.wtf mode, should those needs arise. But I feel this looks pretty adequate in terms of features.

I have further proposed in 3146ea8 a method to mock recursive DNS that I believe should be able to eventually meet the rather significant needs of #7244. That said, I admit that I did not understand @adferrand's idea to try integrate challtestsrv as a forwarder. I would be keen to hear how it would work (e.g. following CNAMEs from AA=0 to AA=1) so we don't accidentally choose something which is a bad fit.

@adferrand
Copy link
Copy Markdown
Collaborator

adferrand commented Nov 13, 2020

@alexzorin I will start the review soon, but from a quick look it looks great. I think the balance is good, and the PR stays in an acceptable size (at least for me, concerning things about certbot-ci :p). We can definitly handle the parallelization problem later.

To precise my idea on challtestsrv, it needs to be challenged, but my intention was not that it could cover any need for the RFC2136 integration tests. For that I think that a carefully crafted and self-contained configuration in Bind9 is definitely better.

My idea was to build on the idea to never allow the DNS server to leak requests on the outside world. But instead of just making a configuration that would make Bind9 respond NXDOMAIN or SERVFAIL errors for unhandled zones, I thought that we could use the challtestsrv as the unique forwarder to Bind9. On top of ensuring that any DNS query would stay local, if I am not mistaken it would make Bind9 respond what is configured in challtestsrv for any other unhandled zone, making it effectively a DNS mock server for the rest of the integration tests (certbot or nginx).

Or course I am pretty sure that Bind9 could be configured to do this mocking by itself, but having challtestsrv could reduce the configuration to do on Bind9 on that matter, while keeping the convenient behavior we have when challtestsrv is used directly. To recall, challtestsrv is primarily designed to help with DNS challenges: it exposes a dead simple REST API to prepare and return the expected TXT requests during a DNS challenge, while fallbacking to a predefined IP for everything else.

Of course this is entirely optional. We could just keep independant runs of certbot+nginx integration tests vs rfc2136 tests. But with this I think we could have a transparent execution of all tests in a row. What do you think of this?

Copy link
Copy Markdown
Collaborator

@adferrand adferrand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for this really elegant PR. It fits very well in the general certbot-ci approach.

- dns_server: rename rfc2136 creds file to .tpl
- dns_server: dont vary dns server port, instead we will vary zone names (certbot#8455)
- dns_server: log error if bind9 fails to stop cleanly
- dns_server: replace assert with raise
- context: remove redundant _worker_id
- context: remove redundant cleanup override
- context: fix seek/flush in credentials context manager
- context: rename skip_if_no_server -> ...bind_server
- context: add newline EOF
@alexzorin
Copy link
Copy Markdown
Collaborator Author

I've thought about the pebble-challtestsrv forwarder question for a few days and I guess I don't have much an opinion either way.

Having one standard stack of services for tests would be nice for certbot-ci.

I have a minor concern about making sure that there is good hygiene: making sure that mock RRs from pebble-challtestsrv do not have accidental contamination of RFC2136 tests (using BIND9 mock RRs), and vice-versa. But that's probably quite doable just by virtue of {0}.{1}.wtf.

Is there an obvious benefit to implementing that now? I can try it out if so. Otherwise, I'm pretty happy with how things have turned out.

@adferrand
Copy link
Copy Markdown
Collaborator

No, no immediate benefit, and this can be addressed in a later time like for parallelization. About one unified stack, I would like so, but bind9 is still an external dependency that needs to be available in the system to make these tests work, and it may be a problem for Windows.

@adferrand
Copy link
Copy Markdown
Collaborator

adferrand commented Nov 16, 2020

@alexzorin I tried the integration tests on a DigitalOcean Ubuntu 20.04 machine and WSL2 under Windows. Both failed with the following stack:

$ tox -e integration-dns-rfc2136
...
integration-dns-rfc2136 run-test: commands[1] | pytest certbot-ci/certbot_integration_tests/rfc2136_tests --acme-server=pebble --dns-server=bind --numprocesses=1
DNS xdist config:
{'address': '127.0.0.1', 'port': 45953}
INTERNALERROR> Traceback (most recent call last):
INTERNALERROR>   File "/root/certbot/certbot-ci/certbot_integration_tests/utils/dns_server.py", line 93, in _start_bind
INTERNALERROR>     self._wait_until_ready()
INTERNALERROR>   File "/root/certbot/certbot-ci/certbot_integration_tests/utils/dns_server.py", line 127, in _wait_until_ready
INTERNALERROR>     raise ValueError(
INTERNALERROR> ValueError: Gave up waiting for DNS server ('127.0.0.1', 45953) to respond
INTERNALERROR>
INTERNALERROR> During handling of the above exception, another exception occurred:
INTERNALERROR>
INTERNALERROR> Traceback (most recent call last):
INTERNALERROR>   File "/root/certbot/.tox/integration-dns-rfc2136/lib/python3.8/site-packages/_pytest/main.py", line 107, in wrap_session
INTERNALERROR>     config._do_configure()
INTERNALERROR>   File "/root/certbot/.tox/integration-dns-rfc2136/lib/python3.8/site-packages/_pytest/config.py", line 935, in _do_configure
INTERNALERROR>     self.hook.pytest_configure.call_historic(kwargs=dict(config=self))
INTERNALERROR>   File "/root/certbot/.tox/integration-dns-rfc2136/lib/python3.8/site-packages/_pytest/vendored_packages/pluggy.py", line 750, in call_historic
INTERNALERROR>     self._hookexec(self, self._nonwrappers + self._wrappers, kwargs)
INTERNALERROR>   File "/root/certbot/.tox/integration-dns-rfc2136/lib/python3.8/site-packages/_pytest/vendored_packages/pluggy.py", line 339, in _hookexec
INTERNALERROR>     return self._inner_hookexec(hook, methods, kwargs)
INTERNALERROR>   File "/root/certbot/.tox/integration-dns-rfc2136/lib/python3.8/site-packages/_pytest/vendored_packages/pluggy.py", line 334, in <lambda>
INTERNALERROR>     _MultiCall(methods, kwargs, hook.spec_opts).execute()
INTERNALERROR>   File "/root/certbot/.tox/integration-dns-rfc2136/lib/python3.8/site-packages/_pytest/vendored_packages/pluggy.py", line 614, in execute
INTERNALERROR>     res = hook_impl.function(*args)
INTERNALERROR>   File "/root/certbot/certbot-ci/certbot_integration_tests/conftest.py", line 41, in pytest_configure
INTERNALERROR>     _setup_primary_node(config)
INTERNALERROR>   File "/root/certbot/certbot-ci/certbot_integration_tests/conftest.py", line 109, in _setup_primary_node
INTERNALERROR>     dns_server.start()
INTERNALERROR>   File "/root/certbot/certbot-ci/certbot_integration_tests/utils/dns_server.py", line 54, in start
INTERNALERROR>     self._start_bind()
INTERNALERROR>   File "/root/certbot/certbot-ci/certbot_integration_tests/utils/dns_server.py", line 96, in _start_bind
INTERNALERROR>     self._stop_bind()
INTERNALERROR> AttributeError: 'DNSServer' object has no attribute '_stop_bind'
Traceback (most recent call last):
  File "/root/certbot/.tox/integration-dns-rfc2136/bin/pytest", line 8, in <module>
    sys.exit(main())
  File "/root/certbot/.tox/integration-dns-rfc2136/lib/python3.8/site-packages/_pytest/config.py", line 58, in main
    return config.hook.pytest_cmdline_main(config=config)
  File "/root/certbot/.tox/integration-dns-rfc2136/lib/python3.8/site-packages/_pytest/vendored_packages/pluggy.py", line 745, in __call__
    return self._hookexec(self, self._nonwrappers + self._wrappers, kwargs)
  File "/root/certbot/.tox/integration-dns-rfc2136/lib/python3.8/site-packages/_pytest/vendored_packages/pluggy.py", line 339, in _hookexec
    return self._inner_hookexec(hook, methods, kwargs)
  File "/root/certbot/.tox/integration-dns-rfc2136/lib/python3.8/site-packages/_pytest/vendored_packages/pluggy.py", line 334, in <lambda>
    _MultiCall(methods, kwargs, hook.spec_opts).execute()
  File "/root/certbot/.tox/integration-dns-rfc2136/lib/python3.8/site-packages/_pytest/vendored_packages/pluggy.py", line 614, in execute
    res = hook_impl.function(*args)
  File "/root/certbot/.tox/integration-dns-rfc2136/lib/python3.8/site-packages/_pytest/main.py", line 140, in pytest_cmdline_main
    return wrap_session(config, _main)
  File "/root/certbot/.tox/integration-dns-rfc2136/lib/python3.8/site-packages/_pytest/main.py", line 135, in wrap_session
    config._ensure_unconfigure()
  File "/root/certbot/.tox/integration-dns-rfc2136/lib/python3.8/site-packages/_pytest/config.py", line 944, in _ensure_unconfigure
    fin()
  File "/root/certbot/certbot-ci/certbot_integration_tests/utils/dns_server.py", line 68, in stop
    shutil.rmtree(self.bind_root)
  File "/root/certbot/.tox/integration-dns-rfc2136/lib/python3.8/shutil.py", line 706, in rmtree
    onerror(os.lstat, path, sys.exc_info())
  File "/root/certbot/.tox/integration-dns-rfc2136/lib/python3.8/shutil.py", line 704, in rmtree
    orig_st = os.lstat(path)
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpj7qvk3hu'
ERROR: InvocationError for command /root/certbot/.tox/integration-dns-rfc2136/bin/pytest certbot-ci/certbot_integration_tests/rfc2136_tests --acme-server=pebble --dns-server=bind --numprocesses=1 (exited with code 1)
_______________________________________________________ summary ________________________________________________________
ERROR:   integration-dns-rfc2136: commands failed

Do you know what is going on?

@alexzorin
Copy link
Copy Markdown
Collaborator Author

Ignoring the simple 9733524 mistake, I wonder if it's because Docker is taking a long time to download the image.

I will try reproduce on DigitalOcean as well.

@adferrand
Copy link
Copy Markdown
Collaborator

Yes, I suspect that. I retried two times with the same result, but given that pytest will clear everything, it is very likely that the docker process is stopped before the big layer that is blocking here is downloaded, so it will be downloaded again.

@alexzorin
Copy link
Copy Markdown
Collaborator Author

While you have the droplet up, could you time this:

docker run --rm -it --entrypoint /bin/true internetsystemsconsortium/bind9:9.16

print("BIND9 did not stop cleanly: {}".format(e), file=sys.stderr)
pass
finally:
shutil.rmtree(self.bind_root)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This call should also be protected by try/except to avoid to polute the logs when the temporary workspace has not even been setup before the main exception in the process occurs.

@adferrand
Copy link
Copy Markdown
Collaborator

adferrand commented Nov 16, 2020

While you have the droplet up, could you time this:

docker run --rm -it --entrypoint /bin/true internetsystemsconsortium/bind9:9.16
$ time docker run --rm -it --entrypoint /bin/true internetsystemsconsortium/bind9:9.16

real    0m0.612s
user    0m0.045s
sys     0m0.055s

@alexzorin
Copy link
Copy Markdown
Collaborator Author

Ah, that's with the image is already downloaded. Perhaps after docker image prune -a.

I am trying the same thing in NYC1 region at the moment.

@adferrand
Copy link
Copy Markdown
Collaborator

Well, with prune:

$ time docker run --rm -it --entrypoint /bin/true internetsystemsconsortium/bind9:9.16
Unable to find image 'internetsystemsconsortium/bind9:9.16' locally
9.16: Pulling from internetsystemsconsortium/bind9
54ee1f796a1e: Pull complete
f7bfea53ad12: Pull complete
46d371e02073: Pull complete
b66c17bbf772: Pull complete
397f607c4492: Pull complete
b0fcfe8aaf93: Pull complete
48fb8b248074: Pull complete
3f7bf47434c7: Pull complete
96362b0e08fb: Pull complete
Digest: sha256:332d41e7b319d5b1b559d20978e2e476097645471f40c3f9037d6144474ecbe6
Status: Downloaded newer image for internetsystemsconsortium/bind9:9.16

real    0m9.446s
user    0m0.084s
sys     0m0.044s

But without prune, the image was already there so the docker should run almost instantly on the second tox invocation. The docker is certainly not starting correctly.

@alexzorin
Copy link
Copy Markdown
Collaborator Author

Hmmm. It worked for me on the first go on DigitalOcean NYC1, Ubuntu 20.04.

Installed python3 python3-venv git and Docker from https://docs.docker.com/engine/install/ubuntu/, then tools/venv3.py, source venv3/bin/activate and tox -e integration-dns-rfc2136.

Are you doing anything different?

@adferrand
Copy link
Copy Markdown
Collaborator

adferrand commented Nov 16, 2020

I put aside WSL 2, since I know that Docker for Windows has some problems to work correctly in that context (one of the main reason why I designed certbot integration tests to run without any docker when Pebble is involved).

For DigitalOcean, I have almost the same setup, but installed Docker with snap, and my droplet is in Amsterdam. I retry with the non-snap installation.

@adferrand
Copy link
Copy Markdown
Collaborator

I put aside WSL 2, since I know that Docker for Windows has some problems to work correctly in that context (one of the main reason why I designed certbot integration tests to run without any docker when Pebble is involved).

For DigitalOcean, I have almost the same setup, but installed Docker with snap, and my droplet is in Amsterdam. I retry with the non-snap installation.

I confirm it is working with a non-snap installation of docker.

@alexzorin
Copy link
Copy Markdown
Collaborator Author

alexzorin commented Nov 16, 2020

OK. If I toggle show_output=True in dns_server.py, we get:

16-Nov-2020 09:08:53.094 starting BIND 9.16.6-Ubuntu (Stable Release) <id:25846cf>
16-Nov-2020 09:08:53.094 running on Linux x86_64 5.4.0-51-generic #56-Ubuntu SMP Mon Oct 5 14:28:49 UTC 2020
16-Nov-2020 09:08:53.094 built with '--build=x86_64-linux-gnu' '--prefix=/usr' '--includedir=/usr/include' '--mandir=/usr/share/man' '--infodir=/usr/share/info' '--sysconfdir=/etc' '--localstatedir=/var' '--disable-silent-rules' '--libdir=/usr/lib/x86_64-linux-gnu' '--libexecdir=/usr/lib/x86_64-linux-gnu' '--disable-maintainer-mode' '--disable-dependency-tracking' '--libdir=/usr/lib/x86_64-linux-gnu' '--sysconfdir=/etc/bind' '--with-python=python3' '--localstatedir=/' '--enable-threads' '--enable-largefile' '--with-libtool' '--enable-shared' '--enable-static' '--with-gost=no' '--with-openssl=/usr' '--with-gssapi=/usr' '--with-libidn2' '--with-libjson-c' '--with-lmdb=/usr' '--with-gnu-ld' '--with-maxminddb' '--with-atf=no' '--enable-ipv6' '--enable-rrl' '--enable-filter-aaaa' '--disable-native-pkcs11' '--enable-dnstap' 'build_alias=x86_64-linux-gnu' 'CFLAGS=-g -O2 -fdebug-prefix-map=/build/bind9-IhPNrB/bind9-9.16.6=. -fstack-protector-strong -Wformat -Werror=format-security -fno-strict-aliasing -fno-delete-null-pointer-checks -DNO_VERSION_DATE -DDIG_SIGCHASE' 'LDFLAGS=-Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-z,now' 'CPPFLAGS=-Wdate-time -D_FORTIFY_SOURCE=2'
16-Nov-2020 09:08:53.094 running as: named -g -c /etc/bind/named.conf -u bind
16-Nov-2020 09:08:53.094 compiled by GCC 9.3.0
16-Nov-2020 09:08:53.094 compiled with OpenSSL version: OpenSSL 1.1.1f  31 Mar 2020
16-Nov-2020 09:08:53.094 linked to OpenSSL version: OpenSSL 1.1.1f  31 Mar 2020
16-Nov-2020 09:08:53.094 compiled with libxml2 version: 2.9.10
16-Nov-2020 09:08:53.094 linked to libxml2 version: 20910
16-Nov-2020 09:08:53.094 compiled with json-c version: 0.13.1
16-Nov-2020 09:08:53.094 linked to json-c version: 0.13.1
16-Nov-2020 09:08:53.094 compiled with zlib version: 1.2.11
16-Nov-2020 09:08:53.094 linked to zlib version: 1.2.11
16-Nov-2020 09:08:53.094 ----------------------------------------------------
16-Nov-2020 09:08:53.094 BIND 9 is maintained by Internet Systems Consortium,
16-Nov-2020 09:08:53.094 Inc. (ISC), a non-profit 501(c)(3) public-benefit
16-Nov-2020 09:08:53.094 corporation.  Support and training for BIND 9 are
16-Nov-2020 09:08:53.094 available at https://www.isc.org/support
16-Nov-2020 09:08:53.094 ----------------------------------------------------
16-Nov-2020 09:08:53.094 found 4 CPUs, using 4 worker threads
16-Nov-2020 09:08:53.094 using 4 UDP listeners per interface
16-Nov-2020 09:08:53.098 using up to 21000 sockets
16-Nov-2020 09:08:53.102 loading configuration from '/etc/bind/named.conf'
16-Nov-2020 09:08:53.102 open: /etc/bind/named.conf: file not found

which I guess makes sense because Docker is not a classic snap? Well, I don't know how Docker works on snap, but it is not surprising that the volume mounts are not working as expected anyhow.

Would you like this to work for the PR to land? The fix is potentially simple according to https://github.com/docker-snap/docker-snap#usage:

All files that docker needs access to should live within your $HOME folder.

At the moment certbot-ci workspace is in /tmp; I could add some code to temporary copy to $HOME (though whether $HOME actually points to a writable directory in every test environment, I'm not sure).

@adferrand
Copy link
Copy Markdown
Collaborator

Ah yes! Several points in favor of using a temporary workspace in the home user:

  • same problem for Boulder, which is using a docker-compose stack
  • I already encountered this issue, you did also and someone else could in the future
  • proper setup and cleanup of workspaces is already carried out by the pytest mechanics

You do not even need to copy the data. Everything is refered relatively to self._workspace in ACMEServer and self.bind_root in DNSServer. So we could change the location of these folders to something under ~/.certbot-ci for instance.

If it is that simple, I think it is worth a try.

@alexzorin
Copy link
Copy Markdown
Collaborator Author

Yuck, we would have to end up sniffing for snapped Docker anyway because docker-compose gets renamed to docker.compose. I'm leaning against doing this, especially if we don't have any users (such as ourselves) who are going to regularly test that it works across the whole test suite.

@adferrand
Copy link
Copy Markdown
Collaborator

Yes, I agree.

@adferrand
Copy link
Copy Markdown
Collaborator

adferrand commented Nov 16, 2020

Also the tempfile module in python honors the TMPDIR env variable if it is set. So it is still possible to globally control the temporary directory location for a given system by setting this variable.

Copy link
Copy Markdown
Collaborator

@adferrand adferrand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@adferrand adferrand merged commit 9055792 into certbot:master Nov 17, 2020
@bmw
Copy link
Copy Markdown
Member

bmw commented Nov 17, 2020

Thanks a lot for tackling this!

orangepizza added a commit to orangepizza/certbot that referenced this pull request Dec 16, 2020
* Implements support for ECDSA keys. Fixes certbot#2163.

Thanks to @pahrohfit and @Tomoyuki-GH for previous efforts to implement
suport for this.

Co-Authored-By: Robert Dailey <rob@wargam.es>
Co-Authored-By: Tomoyuki-GH <55397638+Tomoyuki-GH@users.noreply.github.com>

* Handle unexpected key type migration. (certbot#8435)

Fixes certbot#8365

This PR adds a control when `certbot certonly` or `certbot run` are called for a certificate that already exists and would eventually be replaced. As described in certbot#8365, this control is here to ensure that the user will not modify the key type of their certificate (eg. ECDSA to RSA) without an explicit approval (set explicitly `--cert-name` and `--key-type`), since RSA is the default if not specified.

* Handle unexpected key type migration.

* Update certbot-ci/certbot_integration_tests/certbot_tests/test_main.py

Co-authored-by: Brad Warren <bmw@users.noreply.github.com>

* Add certbot renew --key-type test (certbot#8447)

* Test certbot renew --key-type

* Fix typo

* Use better asserts. Added notes to style guide. (certbot#8451)

* Add --dns-server option in run_acme_server (certbot#7722)

Fixes certbot#7717

This PR adds a `--dns-server` option to the `run_acme_server` test tool, in order to provide an arbitrary DNS server to Pebble or Boulder for the integration tests.

I also take this occasion to make `run_acme_server` a real CLI tool using argparse, and set the `--server-type` (default `pebble`) option as well.

* Set --dns-server flag in run_acme_server

* Default to pebble

* Add documentation

* Configure also Boulder

* cli: improve Obtaining/Renewing wording (certbot#8395)

* cli: improve Obtaining/Renewing wording

* dont use logger, and use new phrasing

* .display_util.notify: dont wrap

As this function is supposed to be an analogue for print, we do not want
it to wrap by default.

* Add certbot-dns-rfc2136 integration testing (certbot#8448)

* tests: add certbot-dns-rfc2136 integration tests

* dont use 'with' form of socket.socket

fixes py2 crash

* address some feedback:

- conftest: make DNS server a global resource
- conftest: add dns_xdist parameter into node config
- conftest: add --dns-server=bind flag
- conftest: if configured, point the ACME server to the DNS server
- dnsserver: make it sort-of compatible with xdist (future-proofing)
- context: parameterize dns-rfc2136 credentials file (future proofing)
- context: reduce dns-rfc2136 propagation time to speed up tests
- tox: add a integration-dns-rfc2136 target
- rfc2136: add a test/zone for subdelegation
- rfc2136: skip tests if no DNS server is configured

* try add integration-dns-rfc2136 to CI

* mock recursive dns via RPZ

* update --dns-server args and tox.ini args

* address more feedback:

- dns_server: rename rfc2136 creds file to .tpl
- dns_server: dont vary dns server port, instead we will vary zone names (certbot#8455)
- dns_server: log error if bind9 fails to stop cleanly
- dns_server: replace assert with raise
- context: remove redundant _worker_id
- context: remove redundant cleanup override
- context: fix seek/flush in credentials context manager
- context: rename skip_if_no_server -> ...bind_server
- context: add newline EOF

* conftest: document _setup_primary_node sideeffects

* ci: rfc2136-integration from standard->nightly

* fix _stop_bind (function was renamed to stop)

* ignore errors from shutil.rmtree during cleanup

* dns_server: check for crash while polling

* remove --dry-run from rfc2136 test

* import print_function

* certbot-ci: fix py2 crash in dns_server

* Read files as binary in crypto_util for crypto.load_certificate. (certbot#8371)

* Flesh out ECDSA documentation (certbot#8464)

* Changelog tweaks.

* Add ECDSA documentation

* Fix typo

* Add Python 3.9 support and tests (certbot#8460)

Fixes certbot#8134.

* Test on Python 3.9.

* Mention Python 3.9 support in changelog.

* s/\( *'Pro.*3\.\)8\(',\)/\18\2\n\19\2/

* undo changes to tox.ini

* Move more tests to Python 3.9

* Update PyYAML and packages which pinned it back

* Upgrade typed-ast

* Use <= to "pin" dnspython

* Fix lint by telling pylint it cannot be trusted

* Disable mypy on RFC plugin

* add comment about <= support

* Fix link typo in README (certbot#8476)

* nginx: fix Unicode crash on Python 2 (certbot#8480)

* nginx: fix py2 unicode sandwich

The nginx parser would crash when saving configuraitons containing
Unicode, because py2's `str` type does not support Unicode.

This change fixes that crash by ensuring that a string type supporting
Unicode is used in both Python 2 and Python 3.

* nginx: add unicode to the integration test config

* update CHANGELOG

* Update changelog for 1.10.0 release

* Release 1.10.0

* Add contents to certbot/CHANGELOG.md for next version

* Bump version to 1.11.0

* Fix changelog typo (certbot#8488)

* fix changelog typo

* remove empty entry

* Deprecate certbot-auto and remove tests

* Completely deprecate certbot-auto

* DeaDeactivate centos6/oraclelinux6 tests

* Remove tests assets

* Remove another test

* Revert "Remove tests assets"

This reverts commit e603afe.

* Undo certbot-auto changes and remove centos6 tests

* Don't deprecate certbot-auto quite yet

* Remove centos6 test farm tests

* undo changes to test farm test scripts

* Deprecate certbot-auto and remove tests

* Completely deprecate certbot-auto

* DeaDeactivate centos6/oraclelinux6 tests

* Remove tests assets

* Remove another test

* Revert "Remove tests assets"

This reverts commit e603afe.

(cherry picked from commit ff3a07d)

* Undo certbot-auto changes and remove centos6 tests

* Don't deprecate certbot-auto quite yet

* Remove centos6 test farm tests

* undo changes to test farm test scripts

(cherry picked from commit e5113d5)

* Fix add deprecated argument (certbot#8500)

Fixes certbot#8495.

To further explain the problem here, `modify_kwargs_for_default_detection` as called in `add` is simplistic and doesn't always work. See certbot#6164 for one other example.

In this case, were bitten by the code https://github.com/certbot/certbot/blob/d1e7404358c05734aaf436ef3c9d709029d62b09/certbot/certbot/_internal/cli/helpful.py#L393-L395

The action used for deprecated arguments isn't in `ZERO_ARG_ACTIONS` so it assumes that all deprecated flags take one parameter.

Rather than trying to fix this function (which I think can only realistically be fixed by certbot#4493), I took the approach that was previously used in `HelpfulArgumentParser.add_deprecated_argument` of bypassing this extra logic entirely. I adapted that function to now call `HelpfulArgumentParser.add` as well for consistency and to make testing easier.

* Rename deprecated arg action class

* Skip extra parsing for deprecated arguments

* Add back test of --manual-public-ip-logging-ok

* Add changelog entry

* Fix changelog typo (certbot#8497)

Co-authored-by: Adrien Ferrand <ferrand.ad@gmail.com>

* Fix add deprecated argument (certbot#8500) (certbot#8501)

Fixes certbot#8495.

To further explain the problem here, `modify_kwargs_for_default_detection` as called in `add` is simplistic and doesn't always work. See certbot#6164 for one other example.

In this case, were bitten by the code https://github.com/certbot/certbot/blob/d1e7404358c05734aaf436ef3c9d709029d62b09/certbot/certbot/_internal/cli/helpful.py#L393-L395

The action used for deprecated arguments isn't in `ZERO_ARG_ACTIONS` so it assumes that all deprecated flags take one parameter.

Rather than trying to fix this function (which I think can only realistically be fixed by certbot#4493), I took the approach that was previously used in `HelpfulArgumentParser.add_deprecated_argument` of bypassing this extra logic entirely. I adapted that function to now call `HelpfulArgumentParser.add` as well for consistency and to make testing easier.

* Rename deprecated arg action class

* Skip extra parsing for deprecated arguments

* Add back test of --manual-public-ip-logging-ok

* Add changelog entry

(cherry picked from commit 5f73274)

* Update changelog for 1.10.1 release

* Release 1.10.1

* Add contents to certbot/CHANGELOG.md for next version

* Bump version to 1.11.0

* cli: clean up `certbot renew` summary (certbot#8503)

* cli: clean up `certbot renew` summary

- Unduplicate output which was being sent to both stdout and stderr
- Don't use IDisplay.notification to buffer output
- Remove big "DRY RUN" guards above and below, instead change language
  to "renewal" or "simulated renewal"
- Reword "Attempting to renew cert ... produced an unexpected error"
  to be more concise.

* add newline to docstring

Co-authored-by: ohemorange <ebportnoy@gmail.com>

Co-authored-by: ohemorange <ebportnoy@gmail.com>

* Update both main VA and remote VA to use the provided DNS server (certbot#8467)

* dns-google: improve credentials error message (certbot#8482)

This adds a 'Error parsing credentials file ...' wrapper to any errors
raised inside certbot-dns-google's usage of oauth2client, to make it
obvious to the user where the problem lies.

* Removed some unused imports. (certbot#8424)

These were not annotated as something that should be ignored, and the test-suite
passes with these changes.

* snap: disable the "user site-packages directory" (certbot#8509)

Although Certbot is a classic snap, it shouldn't load Python code from
the host system. This change prevents packages being loaded from the
"user site-packages directory" (PEP-370). i.e. Certbot will no longer
load DNS plugins installed via `pip install --user certbot-dns-*`.

* add coverage testing to dns-rfc2136 integration (certbot#8469)

* add coverage testing to dns-rfc2136 integration

* add coverage rule for certbot/* as well

* Completely deprecate certbot-auto (certbot#8489)

Fixes certbot#8296

* Completely deprecate certbot-auto

* Add changelog

* Deprecate support for Python 2 (certbot#8491)

Fixes certbot#8388

* Deprecate support for Python 2

* Ignore deprecation warning

* Update certbot/CHANGELOG.md

Co-authored-by: Brad Warren <bmw@users.noreply.github.com>

* Add reminders to update documentation (certbot#8518)

* Add documentation PR checklist item.

* Update contributing doc

* Avoid --system-site-packages during the snap build by preparing a venv with pipstrap that already includes wheel (certbot#8445)

This PR proposes an alternative configuration for the snap build that avoid the need to use `--system-site-package` when constructing the virtual environment in the snap.

The rationale of `--system-site-package` was that by default, snapcraft creates a virtual environment without `wheel` installed in it. However we need it to build the wheels like `cryptography` on ARM architectures. Sadly there is not way to instruct snapcraft to install some build dependencies in the virtual environment before it kicks in the build phase itself, without overriding that entire phase (which is possible with `parts.override-build`).

The alternative proposed here is to not override the entire build part, but just add some preparatory steps that will be done before the main actions handled by the `python` snap plugin. To do so, I take advantage of the `--upgrade` flag available for the `venv` module in Python 3. This allows to reuse a preexisting virtual environment, and upgrade its component. Adding a flag to the `venv` call is possible in snapcraft, thanks to the `SNAPCRAFT_PYTHON_VENV_ARGS` environment variable (and it is already used to set the `--system-site-package`).

Given `SNAPCRAFT_PYTHON_VENV_ARGS` set to `--upgrade` , we configure the build phase as follows:
* create the virtual environment ourselves in the expected place (`SNAPCRAFT_PART_INSTALL`)
* leverage `tools/pipstrap.py` to install `setuptools`, `pip`, and of course, `wheel`
* let the standard build operations kick in with a call to `snapcraftctl build`: at that point the `--upgrade` flag will be appended to the standard virtual environment creation, reusing our crafted venv instead of creating a new one.

This approach has also the advantage to invoke `pipstrap.py` as it is done for the other deployable artifacts, and for the PR validations, reducing risks of shifts between the various deployment methods.

* Deprecate support of Apache 2.2 in certbot-apache (certbot#8516)

Fixes certbot#8462

* Deprecate support of Apache 2.2 in certbot-apache

* Add a changelog

* Add finish_release flags and CLI parsing (certbot#8522)

* Setup a timeout to the remote snap build process (certbot#8484)

This PR adds a `--timeout` flag to `tools/snap/build_remote.py` in order to fail the process if the time execution reaches the provided timeout. It is set to 5h30 on the relevant Azure job, while the job itself has a timeout of 6h managed on Azure side. This allows a slightly better output for these jobs when the snapcraft build stales for any reason.

* add OS package warning (certbot#8533)

* Make our test farm tests instances self-destruct (certbot#8536)

* remove unused user data

* have instance self-destruct in case cleanup fails

* correct kwargs

* fix param order

* remove CentOS 6 cruft from test farm tests (certbot#8534)

* Add path to certbot executable in debug log (certbot#8538)

* Enable again build isolation with proper pinning of build dependencies (certbot#8443)

Fixes certbot#8256

First let's sum up the problem to solve. We disabled the build isolation available in pip>=19 because it could potential break certbot build without a control on our side. Basically builds are not reproductible. Indeed the build isolation triggers build of PEP-517 enabled transitive dependencies (like `cryptography`) with the build dependencies defined in their `pyproject.toml`. For `cryptography` in particular these requirements include `setuptools>=40.6.0`, and quite logically pip will install the latest version of `setuptools` for the build. And when `setuptools` broke with the version 50, our build did the same.

But disabling the build isolation is not a long term solution, as more and more project will migrate on this approach and it basically provides a lot of benefit in how dependencies are built.

The ideal solution would be to be able to apply version constraints on our side on the build dependencies, in order to pin `setuptools` for instance, and decide precisely when we upgrade to a newer version. However for now pip does not provide a mechanism for that (like a `--build-constraint` flag or propagation of existing `--constraint` flag).

Until I saw pypa/pip#9081 and pypa/pip#8439.

Apart the fact that pypa/pip#9081 shows that pip maintainers are working on this issue, it explains how pip works regarding PEP-517 and infers which workaround can be used to still pin the build dependencies. It turns out that pip invokes itself in each build isolation to install the build dependencies. It means that even if some flags (like `--constraint`) are not explicitly passed to the pip sub call, the global environment remains, in particular the environment variables.

Thus it is known that every pip flag can alternatively be set by environment variable using the following pattern for the variable name: `PIP_[FLAG_NAME_UPPERCASE]`. So for `--constraint`, it is `PIP_CONSTRAINT`. And so you can pass a constraint file to the pip sub call through that mechanism.

I made some tests with a constraint file containing pinning for `setuptools`: indeed under isolation zone, the constraint file has been honored and the provided pinned version has been used to build the dependencies (I tested it with `cryptography`).

Finally this PR takes advantage of this mechanism, by setting `PIP_CONSTRAINT` to `pip_install`, the snap building process, the Dockerfiles and the windows installer building process.

I also extracted out the requirements of the new `pipstrap.py` to be reusable in these various build processes.

* Use workaround to fix build requirements in build isolation, and renable build isolation

* Clean imports in pipstrap

* Externalize pipstrap reqs to be reusable

* Inject pipstrap constraints during pip_install

* Update docker build

* Update snapcraft build

* Prepare installer build

* Fix pipstrap constraints in snap build

* Add back --no-build-cache option in Docker images build

* Update snap/snapcraft.yaml

* Use proper flags with pip

Co-authored-by: Brad Warren <bmw@users.noreply.github.com>

* Added certbot-ci to lint section. Silenced and fixed linting warnings. (certbot#8450)

* remove reference to letsencrypt(-auto) (certbot#8531)

* Clean up certbot-auto docs (certbot#8532)

Fixes certbot#8519.

I left the `certbot-auto` docs in `install.rst` to avoid breaking links and to help propagate information about our changes there. I moved it closer to the bottom of the doc though since I think our documentation about OS packages and Docker is more helpful to most people.

* clean up certbot-auto docs

* add more info to changelog

* remove more certbot-auto references

Co-authored-by: Mads Jensen <mje@inducks.org>
Co-authored-by: Robert Dailey <rob@wargam.es>
Co-authored-by: Tomoyuki-GH <55397638+Tomoyuki-GH@users.noreply.github.com>
Co-authored-by: Adrien Ferrand <adferrand@users.noreply.github.com>
Co-authored-by: Brad Warren <bmw@users.noreply.github.com>
Co-authored-by: alexzorin <alex@zorin.id.au>
Co-authored-by: Brad Warren <bmw@eff.org>
Co-authored-by: ohemorange <ebportnoy@gmail.com>
Co-authored-by: Adrien Ferrand <ferrand.ad@gmail.com>
Co-authored-by: alexzorin <alex@zor.io>
Co-authored-by: osirisinferi <github@flut.demon.nl>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Integration tests for certbot-dns-rfc2136

3 participants