Skip to content

feat: validate new certificate on connect#3551

Merged
didier-wenzek merged 16 commits intothin-edge:mainfrom
didier-wenzek:feat/new-certificate-validation
Apr 16, 2025
Merged

feat: validate new certificate on connect#3551
didier-wenzek merged 16 commits intothin-edge:mainfrom
didier-wenzek:feat/new-certificate-validation

Conversation

@didier-wenzek
Copy link
Copy Markdown
Contributor

@didier-wenzek didier-wenzek commented Apr 9, 2025

Proposed changes

Let thin-edge maintain 2 certificates (current and new) for each cloud profile.
The paths to these certificates are derived from the device.cert_path tedge config setting.

  • "$(tedge config get device.cert_path)" is the path to the certificate currently used to connect the cloud endpoint
  • "$(tedge config get device.cert_path).new" is the path to a new certificate still to be validated (if any).

The command tedge cert renew does no more replace the current certificate but store the new certificate in its own file for further validation. The promotion of a new certificate as the current one is done by the tedge connect command after a successful connection.

  • tedge cert renew doesn't replace the current certificate but creates a new file.
  • tedge connect c8y validates any new certificate before promoting it as the current certificate
  • tedge connect az validates any new certificate before promoting it as the current certificate
  • tedge connect aws validates any new certificate before promoting it as the current certificate
  • 'tedge cert show --new' displays the new certificate instead of the current one
  • tedge connect --offline ignores with warning any new certificate
  • tedge cert upload uploads the new certificate when there is one
  • tedge cert remove removes removes new and current certificates
  • tedge cert needs-renewal --new checks the new certificate if any. useless
  • Should tedge cert upload uploads the current and the new certificate when there are both? no, useless
  • On success, tedge cert renew shows the new certificate
  • On success, tedge cert renew warns that the device must be reconnected
  • tedge cert renew uses the CA given by the --ca option (c8y by default)

Types of changes

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Improvement (general improvements like code refactoring that doesn't explicitly fix a bug or add any new functionality)
  • Documentation Update (if none of the other choices apply)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Paste Link to the issue

#3541

Checklist

  • I have read the CONTRIBUTING doc
  • I have signed the CLA (in all commits with git commit -s. You can activate automatic signing by running just prepare-dev once)
  • I ran just format as mentioned in CODING_GUIDELINES
  • I used just check as mentioned in CODING_GUIDELINES
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)

Further comments

@didier-wenzek didier-wenzek added improvement User value theme:cli Theme: cli related topics theme:certificates Theme: Device certificate topics labels Apr 9, 2025
@didier-wenzek didier-wenzek force-pushed the feat/new-certificate-validation branch from bea5756 to dad9115 Compare April 9, 2025 12:25
@didier-wenzek didier-wenzek temporarily deployed to Test Pull Request April 9, 2025 12:25 — with GitHub Actions Inactive
@codecov
Copy link
Copy Markdown

codecov bot commented Apr 9, 2025

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 9, 2025

Robot Results

✅ Passed ❌ Failed ⏭️ Skipped Total Pass % ⏱️ Duration
613 0 3 613 100 1h48m20.13762s

@reubenmiller
Copy link
Copy Markdown
Contributor

@didier-wenzek I just tried out some of the commands (what is implemented so far), and it works nicely...it will definitely simplify the integration with custom renewers.

We can discuss the open points shortly, to finalize the direction of the PR.

@reubenmiller
Copy link
Copy Markdown
Contributor

reubenmiller commented Apr 10, 2025

My thoughts on the open points:

  • tedge cert needs-renewal --new checks the new certificate if any. Easy to implement, but is this useful?

No, we don't need this. The .new cert is only created as a by-product of the current certificate needing renewal. The .new cert will be checked anyway upon reconnection...and generally the .new file will be cleaned up after such failed attempt (we need to work out who is responsible for the removal of it)

  • Should tedge cert upload uploads the current and the new certificate when there are both?

I don't think this will be required. As the tedge cert upload won't work anyway during cert renewals (as you need Tenant Admin priv.), so supporting some fallback is probably overkill for now. We can always add this later if required.

  • tedge cert remove removes the new certificate when there is one. Is this useful?

This should remove both the current cert (which is the expected output when executing it), plus deleting the .new file (if present).

And another topic that came up during my quick usage of the feature:

  • Who/What is responsible for deleting the .new certificate if the connection fails?
  • What should the exit code be when using tedge reconnect c8y but switching to the .new certificate fails (for whatever reason). Ideally it should be a non-zero exit code. See the shell snippets for how having a non-zero exit code would help...

Example renewal scripts

####################################################################################
# Method 1: Using Cumulocity certificate-authority
####################################################################################
tedge renew c8y
if ! tedge reconnect c8y; then
    rm -f "$(tedge config get device.cert_path).new"
    exit 1
fi

echo "Successfully connected"

####################################################################################
# Method 2: Using self-signed certificates with some custom microservice to accept
# devices uploading their own self-signed certificate
####################################################################################
tedge renew c8y --self-signed
if ! tedge http post /c8y/service/devicecert/certificates/upload --file "$(tedge config get device.cert_path).new"; then
    echo "Failed to upload new certificate using a custom microservice" >&2
    rm -f "$(tedge config get device.cert_path).new"
    exit 1
fi

if ! tedge reconnect c8y; then
    rm -f "$(tedge config get device.cert_path).new"
    exit 1
fi

echo "Successfully connected"

@didier-wenzek
Copy link
Copy Markdown
Contributor Author

[ ] Who/What is responsible for deleting the .new certificate if the connection fails?

I think the better is to not delete the .new certificate when the connection is not successful.

  • the root cause for the connection failure might be unrelated to the .new certificate
  • the renew command will in any case replace the .new certificate

[ ] What should the exit code be when using tedge reconnect c8y but switching to the .new certificate fails (for whatever reason). Ideally it should be a non-zero exit code. See the shell snippets for how having a non-zero exit code would help...

Offline, we agreed to exit with a specific status code when the connection failed with the .new certificate.

  • 0: successfully connected (using the new certificate, if one was present)
  • 1: failed to connect (possibly with the new certificate despite, this certificate has been successfully validated)
  • 2: command line parsing error (clap)
  • 3: properly connected - but with the previous certificate while a new one is present

@didier-wenzek
Copy link
Copy Markdown
Contributor Author

[ ] What should the exit code be when using tedge reconnect c8y but switching to the .new certificate fails (for whatever reason). Ideally it should be a non-zero exit code. See the shell snippets for how having a non-zero exit code would help...

Offline, we agreed to exit with a specific status code when the connection failed with the .new certificate.

* 0: successfully connected (using the new certificate, if one was present)

* 1: failed to connect (possibly with the new certificate despite, this certificate has been successfully  validated)

* 2: command line parsing error (clap)

* 3: properly connected - but with the previous certificate while a new one is present

Done: 8be4190

@didier-wenzek didier-wenzek marked this pull request as ready for review April 11, 2025 09:40
@reubenmiller
Copy link
Copy Markdown
Contributor

reubenmiller commented Apr 11, 2025

@didier-wenzek Overall it works nicely (especially the new exit code for the tedge reconnect c8y).

There was only 1 item which could be improved:

Item 1: tedge cert upload uploads the new certificate when there is one

If there is a .new certificate and the tedge cert upload c8y uploads the .new certificate instead of the default one, then the command should include some information about which certificate it uploaded, otherwise it is not super clear for the user, not even with --log-level trace.

Below shows the current output when there was both a "main" certificate and a .new certificate.

# create some initial certificate
tedge cert create --device-id example01

# copy the cert (so both the cert and .new cert exist)
cp /etc/tedge/device-certs/tedge-certificate.pem /etc/tedge/device-certs/tedge-certificate.pem.new

# remove the current cert and create a new cert with a new device-id
tedge cert remove
tedge cert create --device-id example02

# upload to Cumulocity (though the output does not show which cert is uploaded)
tedge cert upload c8y
Certificate uploaded successfully.

@rina23q
Copy link
Copy Markdown
Member

rina23q commented Apr 11, 2025

tedge connect c8y error message looks repeating.

Checking new certificate validity: /etc/tedge/device-certs/tedge-certificate.pem.new... ✗
error: Connection error while creating device in Cumulocity: Mqtt state: Mqtt serialization/deserialization error: IO: received fatal alert: CertificateUnknown: Mqtt serialization/deserialization error: IO: received fatal alert: CertificateUnknown: IO: received fatal alert: CertificateUnknown: received fatal alert: CertificateUnknown
Error validating the new certificate: Connection error while creating device in Cumulocity: Mqtt state: Mqtt serialization/deserialization error: IO: received fatal alert: CertificateUnknown: Mqtt serialization/deserialization error: IO: received fatal alert: CertificateUnknown: IO: received fatal alert: CertificateUnknown: received fatal alert: CertificateUnknown
  => keep using the current certificate unchanged /etc/tedge/device-certs/tedge-certificate.pem
Creating device in Cumulocity cloud... ✗
error: Connection error while creating device in Cumulocity: Mqtt state: Mqtt serialization/deserialization error: IO: received fatal alert: CertificateUnknown: Mqtt serialization/deserialization error: IO: received fatal alert: CertificateUnknown: IO: received fatal alert: CertificateUnknown: received fatal alert: CertificateUnknown

The steps I have done is below. I didn't upload both old and new self-signed certificate with intention to cause an error.

root@11d89e85c51e:/setup# tedge cert remove
Certificate was successfully removed

root@11d89e85c51e:/setup# tedge cert create --device-id aaa
Certificate was successfully created

Certificate:   /etc/tedge/device-certs/tedge-certificate.pem
Subject:       CN=aaa, O=Thin Edge, OU=Device
Issuer:        CN=aaa, O=Thin Edge, OU=Device
Status:        VALID (expires in: 11months 29days 3h 50m 23s)
Valid from:    Thu, 10 Apr 2025 13:04:23 +0000
Valid until:   Fri, 10 Apr 2026 13:04:23 +0000
Serial number: 525188842049310620321124847034595044436808047873 (0x5bfe483c5c128f4bc6d974814f1e3f42ec2c1d01)
Thumbprint:    B2B6EBA9118F82A3FF188F5960D7EB8D46F36B3A

root@11d89e85c51e:/setup# tedge cert renew --self-signed
Certificate was successfully renewed, for un-interrupted service, the certificate has to be uploaded to the cloud

# I didn't upload both old and new certificates
root@11d89e85c51e:/setup# tedge connect c8y
connect to Cumulocity cloud.:
        device id: aaa
        cloud profile: <none>
        cloud host: rina.xxx.xxx:8883.  #masked
        auth type: Certificate
        certificate file: /etc/tedge/device-certs/tedge-certificate.pem
        cryptoki: false
        bridge: mosquitto
        service manager: systemd
        mosquitto version: 2.0.11
Checking new certificate validity: /etc/tedge/device-certs/tedge-certificate.pem.new... ✗
error: Connection error while creating device in Cumulocity: Mqtt state: Mqtt serialization/deserialization error: IO: received fatal alert: CertificateUnknown: Mqtt serialization/deserialization error: IO: received fatal alert: CertificateUnknown: IO: received fatal alert: CertificateUnknown: received fatal alert: CertificateUnknown
Error validating the new certificate: Connection error while creating device in Cumulocity: Mqtt state: Mqtt serialization/deserialization error: IO: received fatal alert: CertificateUnknown: Mqtt serialization/deserialization error: IO: received fatal alert: CertificateUnknown: IO: received fatal alert: CertificateUnknown: received fatal alert: CertificateUnknown
  => keep using the current certificate unchanged /etc/tedge/device-certs/tedge-certificate.pem
Creating device in Cumulocity cloud... ✗
error: Connection error while creating device in Cumulocity: Mqtt state: Mqtt serialization/deserialization error: IO: received fatal alert: CertificateUnknown: Mqtt serialization/deserialization error: IO: received fatal alert: CertificateUnknown: IO: received fatal alert: CertificateUnknown: received fatal alert: CertificateUnknown

@reubenmiller
Copy link
Copy Markdown
Contributor

reubenmiller commented Apr 11, 2025

tedge connect c8y error message looks repeating.

This isn't actually repeating, there is one error per certificate (in this case 2 attempts where both failed). Also, a failed .new certificate is not deleted (this is the responsibility of the caller). In practice, the deletion of the failed .new certificate will be done by the cert renewer service after it receives a non-zero exit code from the tedge reconnect c8y command.

@rina23q
Copy link
Copy Markdown
Member

rina23q commented Apr 11, 2025

Sorry what I meant "repeating" is a different point... one error message contains 4 times received fatal alert: CertificateUnknown. Is it actually one error that has many nests?

error: Connection error while creating device in Cumulocity: Mqtt state: Mqtt serialization/deserialization error: IO: received fatal alert: CertificateUnknown: Mqtt serialization/deserialization error: IO: received fatal alert: CertificateUnknown: IO: received fatal alert: CertificateUnknown: received fatal alert: CertificateUnknown

@didier-wenzek
Copy link
Copy Markdown
Contributor Author

As agreed offline:

  • On success, tedge cert renew shows the new certificate
  • On success, tedge cert renew warns that the device must be reconnected
  • tedge cert renew uses the CA given by the --ca option (c8y by default)

I also updated tedge cert create and tedge cert download to have a consistent behavior.

Addressed by:

Copy link
Copy Markdown
Contributor

@reubenmiller reubenmiller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved. A very nice feature that makes the certificate renewal process a lot more reliable, and it makes integrating custom renew logic very easy.

Any further improvements can be done in following PRs after we've collected some user feedback.

Copy link
Copy Markdown
Contributor

@albinsuresh albinsuresh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything is working nicely. The command outputs are really clear now.

Not for this PR, but just a thought: Should the tedge cert show command give any hints about the presence of the .new certificate if one is present (with an additional hint about how to view it using the --new option)? Since the cert renew command output is sufficiently verbose right now, this isn't such a strong requirement. Just an additional guidance, that's all.

didier-wenzek and others added 16 commits April 16, 2025 09:56
This is a very first re-organization step. The end goal is to be able
to check a new certificate before connecting the cloud and to fallback
to the current one on failure. For that the code of the connect command
has to be re-organize to clearly separate:
- connection checks using the bridge
- connection checks using a direct connection
- bridge configuration and processes restart.

Signed-off-by: Didier Wenzek <didier.wenzek@free.fr>
Signed-off-by: Didier Wenzek <didier.wenzek@free.fr>
Signed-off-by: Didier Wenzek <didier.wenzek@free.fr>
Signed-off-by: Didier Wenzek <didier.wenzek@free.fr>
Signed-off-by: Didier Wenzek <didier.wenzek@free.fr>
Signed-off-by: Didier Wenzek <didier.wenzek@free.fr>
Signed-off-by: Didier Wenzek <didier.wenzek@free.fr>
Signed-off-by: Didier Wenzek <didier.wenzek@free.fr>
Signed-off-by: Didier Wenzek <didier.wenzek@free.fr>
Signed-off-by: Didier Wenzek <didier.wenzek@free.fr>
Signed-off-by: Didier Wenzek <didier.wenzek@free.fr>
…cate

Signed-off-by: reubenmiller <reuben.d.miller@gmail.com>
Signed-off-by: Didier Wenzek <didier.wenzek@free.fr>
Signed-off-by: Didier Wenzek <didier.wenzek@free.fr>
Signed-off-by: Didier Wenzek <didier.wenzek@free.fr>
Signed-off-by: Didier Wenzek <didier.wenzek@free.fr>
@didier-wenzek didier-wenzek force-pushed the feat/new-certificate-validation branch from 334b684 to 2037802 Compare April 16, 2025 08:41
@didier-wenzek didier-wenzek added this pull request to the merge queue Apr 16, 2025
Merged via the queue into thin-edge:main with commit f2e2201 Apr 16, 2025
51 of 52 checks passed
@didier-wenzek didier-wenzek deleted the feat/new-certificate-validation branch April 16, 2025 10:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

improvement User value theme:certificates Theme: Device certificate topics theme:cli Theme: cli related topics

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants