mgr/dashboard: Simplify authentication protocol#54710
Conversation
|
issue with more background information: https://tracker.ceph.com/issues/63529 just for the record - we didn't have issues with upgrades, the dashboard is simply not functional with more recent versions of pyo3 (such as the one in bookworm or later). we don't use the dashboard whatsoever. the cephadm component also uses python-cryptography FWIW, although I am not sure whether that part runs in the context of ceph-mgr(-dashboard) or not. |
|
thank you guys for the context here. I'll test this out locally in the coming days. I have a query here, this looks like an error in the cryptography package. and I have seen some threads saying this might be fixed in the latest cryptography and some indicating that pyo3 itself might be fixing the issue. does that sounds true to you guys since you guys spend more time looking at the error than me? Also, its possible that other packages can also affected by this and we might not be able to remove those dependencies. But that looks like a discussion for future (very near future) i guess. |
|
proper support for sub interrpeters on the pyo3 side is still far off - currently, the check only got changed from being over-zealous to actually only checking for different sub interpreters. but ceph explicitly uses such sub interpreters to have different contexts for each module, so it also doesn't work with the improved upstream packages. and yes, basically everything using pyo3 (or the cryptography module, which uses pyo3) is affected when using sub interpreters. |
epuertat
left a comment
There was a problem hiding this comment.
Thanks a lot @kalaspuffar for the contribution. I just left some comments around, also for the @ceph/dashboard team to review.
I'm hesitant about this change, as it hard-codes some external dependency inside Dashboard code, which prevents us from benefiting of any future CVE reports and fixes from the original python-JWT project.
Isn't there any move/workaround from the cryptography contributors to cope with this breaking change?
src/pybind/mgr/dashboard/tox.ini
Outdated
| deps = | ||
| -rrequirements.txt | ||
| -cconstraints.txt | ||
| PyJWT |
There was a problem hiding this comment.
Why is this still required here?
There was a problem hiding this comment.
I've added it here because we use it in the QA testing framework. Using it to test the dashboard could be a benefit to verify that the new code is following the specifications. We could change the QA test to use our code and remove this dependency in the future if we want.
There was a problem hiding this comment.
Ah, got it. Makes sense. There's a requirements-test.txt file for test-only deps (and we should ensure pinning that to a specific version).
That said, I'd only keep this dep if we actually used in a new test added to tests/test_auth.py in order to validate our custom JWT encoding/decoding with an external implementation.
There was a problem hiding this comment.
Hi @epuertat
So if qa/tasks/mgr/dashboard/test_auth.py uses it I'm not allowed to keep that dependency? Then how do I add it so that the test can use the dependency? Does the QA part of the code have a separate dependency file?
Best regards
Daniel
There was a problem hiding this comment.
If I am not wrong here, adding it to the requirements-test.txt will also pull these dependencies in our qa suites.
There was a problem hiding this comment.
BTW, everything under qa/ dir is integration/e2e (teuthology) testing. Just to verify a codec, it's a total overkill to rely on that (it requires building packages for all supported distros, which takes 2-3 hours, and then deploy a real Ceph cluster).
That would be the perfect example of a unit tests: you feed the function under test with a specific set of inputs, and compare the output provided with the expected output (and you don't even need the jwt module imported there, just to hard-code a list of static inputs vs outputs).
There was a problem hiding this comment.
|
BTW, we should also remove the python-JWT dep from the |
nizamial09
left a comment
There was a problem hiding this comment.
So far I've tested this with the dashboard and it looks good to me!
As @epuertat mentioned we have these python-jwt (for fedora...) and PyJWT (for suse) dependencies provided in the ceph.spec.in and debian/control, so probably that needs to be removed as well!
Great, I've made the changes locally and I'm running the tests. Will push when done. |
By removing the dependency to PyJWT we also remove the dependency to the cryptographic library which in the dashboard module will create a crash. In newer implementations of the library PyO3 is used to run rust code in order to encrypt with Elliptic Curves. This is never used in the dashboard communication so a much simpler implementation where we only use the hmac sha256 algorithm to create the signed JWT message could be used. Fixes: https://forum.proxmox.com/threads/ceph-warning-post-upgrade-to-v8.129371 Signed-off-by: Daniel Persson <mailto.woden@gmail.com>
a8cb685 to
c616a9d
Compare
|
Hi @epuertat and @nizamial09 I've now made the changes suggested by @epuertat. Best regards |
nizamial09
left a comment
There was a problem hiding this comment.
Looks good to me! I'll add some more people here to test this out locally. Thank you @kalaspuffar
|
shaman builds are passing ✔️ |
|
@kalaspuffar can you plz update the last 2 follow-up commit titles to be consistent with the community guidelines? Thanks! |
Move the JWT requirement to the test requirements file. Also remove JWT from ceph specification and debian build. Signed-off-by: Daniel Persson <mailto.woden@gmail.com>
Seemed that the test dependencies was separated in two different requirements files one for the testing and one for linting. Added the JWT dependency in the linting file as well. Signed-off-by: Daniel Persson <mailto.woden@gmail.com>
1eb8ec5 to
06765e6
Compare
I hope these amendments to the commit messages are more in line with the guidelines. Another thing we could consider is using the first commits message and squash the other commits into one. Best regards |
Yup, these are ok to me! Thanks @kalaspuffar ! |
|
jenkins test windows |
|
Unfortunately this only fixed the dashboard module. Most of the rest of the modules use bcrypt and fail to load. In particular |
This could related to pyca/bcrypt#694 which I had been reported. Most likely bcrypt side should already give enough fix, but need PyO3 side for sub-interpreter support (seems no real timeline schedule yet...) So ideally if ceph-mgr related don't really need bcrypt support we better remove it ;-( |
After upgrading from PVE 7 to PVE 8, some users noted that the Ceph Dashboard does not work anymore. [0] A user from our community provided a pull request [1] which removes a dependency to `PyJWT` (Python). This commit adds a backport of this PR as a single patch. This patch by itself however does not yet allow the dashboard to run with TLS enabled. [0]: https://forum.proxmox.com/threads/ceph-warning-post-upgrade-to-v8.129371/ [1]: ceph/ceph#54710 Signed-off-by: Max Carrara <m.carrara@proxmox.com>
|
Uh, fun with sub-interpreters again. Seems really to be a never ending source of joy |
|
Meta: it's now December 2024,
This is, by the way, a blocker: the ceph dsahboard (in reef) cannot be installed. Period, End of story. I currently see five alternatives:
I conclude the only viable solution is the first bullet: Is there anything else? I received this comment:
Based on this comment, I conclude the use of python subinterpreters in Ceph was a design flaw. |
|
Hi @linas I've noticed this issue and have it in our environment today. So, there is no way we can upgrade our production environment. But we run our development environment on this code, so I have the problem in question. I wanted to discuss this issue with the team during Cephalocon this week which I will attend, hope to see you all there. But my main points are that this code is either unnecessary or something we need to rebuild to be in line with the community. When I last reviewed the code, we had added new code, so the crypto library was not only used for the login function, which I rewrote because it did not need that library. But now we have added nice-to-have functionality to create and verify SSL certificates in Python code. That is not something that is strictly necessary, but it is great to simplify things for the customer. These use the library OpenSSL and have recently started to use the problematic code from cryptography / PyO3. I see three different approaches we could pursue.
I'm looking forward to seeing you soon. Best regards |
|
Hi Daniel, I will not be at Cephalocon; I'm just an ordinary private user, running ceph for myself, and not for some company. Have no money for travel.
|
|
Hey there, just wanted to weigh in here regarding the Python sub-interpreter issue and perhaps clarify a few things -- I made that tracking issue PyO3/pyo3#3451 a while ago, and yes, unfortunately I've been getting delayed and delayed again due to issues I won't elaborate on right now. But nevertheless, this will move forward eventually. Though, I want to clarify, I'm not a maintainer of PyO3 (and neither did I get overwhelmed 😉) -- one of the reasons why the RFC PR PyO3/pyo3#4162 I had submitted didn't see any more progress is because the PyO3 maintainers focused their efforts on bringing freethreading support to PyO3 (see e.g. PyO3/pyo3#4588) and I had to attend to other matters in the meantime as well. So, since both the maintainers and I were busy elsewhere, naturally there was no movement forward on the whole sub-interpreter ordeal. With all that being said, I'm in the process of rebasing my PyO3 branch. But I'd rather not over-promise anything and instead soon deliver something more concrete and lay out a path forward. |
|
|
Hi @epuertat We talked about it at Cephalocon and did not really come to any conclusions, but I think either a new module or perhaps moving the functionality into the RestAPI module, and then anyone needing that functionality could just make a call to the already established APIs to do those kinds of workloads. Best regards |
This is a backport of ceph#54710 which fixes the Ceph Dashboard not being able to launch on Ceph Reef running on Debian Bookworm. This is achieved by removing the dependency on `PyJWT` (Python) and thus transitively also removing the dependency on `cryptography` (Python). For more information, see the original pull request. Note that the Ceph Dashboard still cannot be used if TLS is activated, because `pyOpenSSL` is used to verify certs during launch. Disabling TLS via `ceph config set mgr mgr/dashboard/ssl false` and using e.g. a reverse proxy can be used as a workaround. A separate patch is required to allow the dashboard to run with TLS enabled. Fixes: https://forum.proxmox.com/threads/ceph-warning-post-upgrade-to-v8.129371 Signed-off-by: Daniel Persson <mailto.woden@gmail.com> Signed-off-by: Max Carrara <m.carrara@proxmox.com>
Hi Team
I got introduced to the thread mentioned below where Proxmox had issues with upgrades for multiple months going to bookworm. They have packaged Ceph themselves but could not get the JWT implementation in the dashboard to work because of Python Interpreters not liking the rust code that ran under the hood in the crypto library.
From the thread:
JWT is a pretty simple specification and Ceph dashboard only use one encryption algorithm that is already available in Python so why not replace it with a small implementation of the encode and decode functions.
By removing the dependency to PyJWT we also remove the dependency to the cryptographic library, which in the dashboard module will create a crash. In newer implementations of the library PyO3 is used to run rust code in order to encrypt with Elliptic Curves. This is never used in the dashboard communication so a much simpler implementation where we only use the hmac sha256 algorithm to create the signed JWT message could be used.
Fixes: https://forum.proxmox.com/threads/ceph-warning-post-upgrade-to-v8.129371
Impacts: Dashboard
No doc update is appropriate because it's a simplification of code.
Added no extra tests.