Skip to content

Add cert that passes SSL checks#1366

Merged
alxndrsn merged 3 commits intogetodk:nextfrom
yanokwa:fix-ssl-certs
Oct 22, 2025
Merged

Add cert that passes SSL checks#1366
alxndrsn merged 3 commits intogetodk:nextfrom
yanokwa:fix-ssl-certs

Conversation

@yanokwa
Copy link
Member

@yanokwa yanokwa commented Sep 17, 2025

Currently, Central fails SSL Labs' test with

Assessment failed: Internal Server Error

People (including us) refer to this test to show that our configs are secure. The problem seems to be that the dummy cert we use is too dumb.

What has been done to verify that this works as intended?

ChatGPT told me we need X.509 v3, a subjectAltName and be reasonably valid to work with SSL scanners, so I made those changes and confirmed that everything works.

Screenshot 2025-09-16 at 18 58 40

The lack of SNI means we don't support these old browsers: Android 2.3.7, IE 6 / XP, IE 8 / XP, Java 6u45.

Why is this the best possible solution? Were any other approaches considered?

We could get rid of the dummy block that was added at #814 and make the real vhost the default. Not 100% sure why that was strictly necessary to begin with.

We could also add do something like this...

server {
  listen 443 ssl default_server;
  server_name ${CERT_DOMAIN} _;

  ssl_certificate         /etc/${SSL_TYPE}/live/${CERT_DOMAIN}/fullchain.pem;
  ssl_certificate_key     /etc/${SSL_TYPE}/live/${CERT_DOMAIN}/privkey.pem;
  ssl_trusted_certificate /etc/${SSL_TYPE}/live/${CERT_DOMAIN}/fullchain.pem;

  ssl_protocols TLSv1.2 TLSv1.3;
  ssl_prefer_server_ciphers off;
  ssl_session_tickets off;

  # 🔒 Hard stop: only serve the real host
  if ($host != '${CERT_DOMAIN}') { return 421; }

How does this change affect users? Describe intentional changes to behavior and behavior that could have accidentally been affected by code changes. In other words, what are the regression risks?

It doesn't

Does this change require updates to documentation? If so, please file an issue here and include the link below.

No.

Before submitting this PR, please make sure you have:

  • branched off and targeted the next branch OR only changed documentation/infrastructure (master is stable and used in production)
  • verified that any code or assets from external sources are properly credited in comments or that everything is internally sourced

@yanokwa yanokwa requested a review from alxndrsn September 17, 2025 02:16
@brontolosone
Copy link
Contributor

We could get rid of the dummy block that was added at #814 and make the real vhost the default. Not 100% sure why that was strictly necessary to begin with.

I'm not sure I have the full context, but in case it might help: the idea might be to filter out misdirected requests. It's indeed not strictly necessary if you don't mind any and all requests, with any and all Host: headers, including those by bots, including those for strange domains (clients arriving here by way of DNS records of any number of previous owners of that VPS's IP still pointing at it) to end up in access logs / error logs and (depending on the shape of the request) even hitting Node (which is way more expensive than having it be stopped at the gate, eg by Nginx).

@matthew-white matthew-white changed the base branch from master to next October 3, 2025 17:32
@matthew-white
Copy link
Member

Since this PR changes code, I think we should merge into next rather than master. I just updated the target branch.

I'm also going to add this PR to the project just to keep track of it in the lead-up to releasing v2025.3.

@github-project-automation github-project-automation bot moved this to 🕒 backlog in ODK Central Oct 3, 2025
@matthew-white matthew-white moved this from 🕒 backlog to ✏️ in progress in ODK Central Oct 3, 2025
@matthew-white
Copy link
Member

@yanokwa, it'd be great if this PR could be merged in the next week in order to go out with the v2025.3 release. I see that you tagged @alxndrsn for review, but I bet @brontolosone or Sadiq could review as well.

@alxndrsn

This comment was marked as off-topic.

alxndrsn

This comment was marked as outdated.

@alxndrsn
Copy link
Contributor

alxndrsn commented Oct 14, 2025

Currently, Central fails SSL Labs' test with

Assessment failed: Internal Server Error

@yanokwa are you sure the SSL Labs failure relates to the SSL cert changed in this test? "Internal Server Error" suggests this is more likely a problem with SSL Labs code.

In addition, you can see that the cert served at https://production.getodk.cloud already has CN set (which this PR attempts to do):

$ openssl s_client -showcerts -servername production.getodk.cloud -connect production.getodk.cloud:443 < /dev/null 2>/dev/null | openssl x509 -inform pem -noout -text | grep CN
        Issuer: C = US, O = Let's Encrypt, CN = E7
        Subject: CN = production.getodk.cloud

Viewed in Firefox:

Screenshot_2025-10-14_13-56-07

The changes in this PR relate to the cert served when an unrelated domain is requested from a Central nginx instance, e.g.:

$ openssl s_client -showcerts -servername unrelated-domain.example.com -connect production.getodk.cloud:443 < /dev/null 2>/dev/null | openssl x509 -inform pem -noout -text
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
            19:9e:0d:9e:6d:a9:f0:69:b2:5b:1b:10:c7:90:49:fe:c9:ee:be:b7
        Signature Algorithm: sha256WithRSAEncryption
        Issuer: 
        Validity
            Not Before: Sep 13 18:48:16 2025 GMT
            Not After : Jan 14 18:48:16 3025 GMT
        Subject: 
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                Public-Key: (2048 bit)

@alxndrsn
Copy link
Contributor

alxndrsn commented Oct 14, 2025

A couple of similar-looking issues:

  1. Assessment failed: Internal Server Error ssllabs/ssllabs-scan#816

After analyzing the issue it was found that the server is responding differently for SSLv3 protocol detection logic. It is confirmed that the server does not support SSLv3 but the way the server responds to the SSLv3 request is not correct.
If the server does not support a particular protocol/request then it should respond with an "Alert" response

  1. https://success.qualys.com/discussions/s/question/0D52L00006fXul3SAC/assessment-failed-internal-server-error-ssl-server-test

Errors of this type will often be reported for servers that use connection rate limits or block connections in response to unusaual traffic. Problems of this type are very difficult to diagnose. If you have access to the server being tested, please check that there is no rate limitiing or IDS in place.

The linked report (https://www.ssllabs.com/ssltest/analyze.html?d=production.getodk.cloud) says similar to 2:

Failed to obtain certificate and Internal Error - errors of this type will often be reported for servers that use connection rate limits or block connections in response to unusual traffic. Problems of this type are very difficult to diagnose. If you have access to the server being tested, before reporting a problem to us, please check that there is no rate limiting or IDS in place.

@alxndrsn
Copy link
Contributor

Looking at the verbose output of the SSL Labs API:

$ docker run --read-only --cap-drop all --rm jumanjiman/ssllabs-scan:latest -verbosity trace https://production.getodk.cloud
...
{"host":"https://production.getodk.cloud","port":443,"protocol":"http","isPublic":false,"status":"READY","startTime":1760441154576,"testTime":1760441162930,"engineVersion":"2.4.1","criteriaVersion":"2009q","cacheExpiryTime":1760441762930,"endpoints":[{"ipAddress":"35.163.56.100","serverName":"ec2-35-163-56-100.us-west-2.compute.amazonaws.com","statusMessage":"Internal Server Error","statusDetails":"TESTING_PROTO_3_3_NO_SNI","statusDetailsMessage":"Testing TLS 1.2 without SNI","duration":8129,"delegation":1,"details":{"hostStartTime":1760441154576,"certChains":[],"protocols":[],"prefixDelegation":false,"nonPrefixDelegation":true,"zeroRTTEnabled":-1,"zombiePoodle":0,"goldenDoodle":0,"zeroLengthPaddingOracle":0,"sleepingPoodle":0}}],"certs":[]}

It looks like the failure may be while "Testing TLS 1.2 without SNI".

This makes some sense that it might relate to the "wrong host" certificate.

I think you can demonstrate the difference between with/without SNI by comparing the output of these two commands:

printf "GET / HTTP/1.1\r\nHost: production.getodk.cloud\r\nConnection: close\r\n\r\n" | openssl s_client -connect production.getodk.cloud:443 -tls1_2 -noservername
printf "GET / HTTP/1.1\r\nHost: production.getodk.cloud\r\nConnection: close\r\n\r\n" | openssl s_client -connect production.getodk.cloud:443 -tls1_2

From man openssl-s_client:

       -servername name
           Set the TLS SNI (Server Name Indication) extension in the
           ClientHello  message  to the given value.  If -servername
           is not provided, the TLS SNI extension will be  populated
           with  the name given to -connect if it follows a DNS name
           format. If -connect is not provided either,  the  SNI  is
           set  to  "localhost".   This is the default since OpenSSL
           1.1.1.

           Even though SNI should normally be a DNS name and not  an
           IP  address,  if  -servername  is provided then that name
           will be sent, regardless of whether it is a DNS  name  or
           not.

           This   option   cannot   be   used  in  conjunction  with
           -noservername.

       -noservername
           Suppresses sending of the SNI  (Server  Name  Indication)
           extension  in  the ClientHello message. Cannot be used in
           conjunction with  the  -servername  or  -dane_tlsa_domain
           options.

Checking without the -tls1_2 flag, it looks like the problem is not specific to TLS v1.2.

Is HTTPS usage without SNI worth supporting?

@yanokwa
Copy link
Member Author

yanokwa commented Oct 14, 2025

Is HTTPS usage without SNI worth supporting?

I don't know, but I don't think so?

@matthew-white
Copy link
Member

@alxndrsn, would you mind taking another look at this PR once you get the chance? It'd be ideal to get this merged within the next couple of days if possible.

alxndrsn pushed a commit to alxndrsn/odk-central that referenced this pull request Oct 21, 2025
The test for SSLLabs implodes if a Common Name is not set for the cert served for incorrect domains.

See: https://www.ssllabs.com/ssltest/analyze.html?d=production.getodk.cloud

This is a test for the change made in getodk#1366
@alxndrsn alxndrsn mentioned this pull request Oct 21, 2025
2 tasks
@alxndrsn
Copy link
Contributor

Is HTTPS usage without SNI worth supporting?

I don't know, but I don't think so?

I think this is now answered in the main PR message:

The lack of SNI means we don't support these old browsers: Android 2.3.7, IE 6 / XP, IE 8 / XP, Java 6u45.

@alxndrsn alxndrsn merged commit 5cc5e75 into getodk:next Oct 22, 2025
5 checks passed
@github-project-automation github-project-automation bot moved this from ✏️ in progress to ✅ done in ODK Central Oct 22, 2025
alxndrsn added a commit that referenced this pull request Oct 22, 2025
The test for SSLLabs implodes if a Common Name is not set for the cert served for incorrect domains.

This is a test for the change made in #1366.

See: https://www.ssllabs.com/ssltest/analyze.html?d=production.getodk.cloud&clearCache=on
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: ✅ done

Development

Successfully merging this pull request may close these issues.

4 participants