-
Notifications
You must be signed in to change notification settings - Fork 293
[KubejobRuntime] Secret token mounting for KubejobRuntime #9013
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
server/py/services/api/launcher.py
Outdated
| # TODO implement tests during ML-11600 | ||
| def _resolve_and_validate_token_name_from_secret(self, run: mlrun.run.RunObject): | ||
| if run.spec.auth.get("token_name"): | ||
| # TODO perform token name validation ML-11600 | ||
| pass | ||
| else: | ||
| # TODO perform token name resolution and validation ML-11600 | ||
| run.spec.auth["token_name"] = "default" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
duplicate TODO, simplify it. you can also write a function that "validate" the token name and put the todo there.
mlrun/common/constants.py
Outdated
| JOB_TYPE_RERUN_WORKFLOW_RUNNER = "rerun-workflow-runner" | ||
| MLRUN_ACTIVE_PROJECT = "MLRUN_ACTIVE_PROJECT" | ||
|
|
||
| MLRUN_AUTH_SECRET_PATH = "/var/mlrun-secrets/auth" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| MLRUN_AUTH_SECRET_PATH = "/var/mlrun-secrets/auth" | |
| MLRUN_JOB_AUTH_SECRET_PATH = "/var/mlrun-secrets/auth" |
job/functions/everything that mlrun spins of curse but mainly for mlrun run pods as opposed to mlrun IDE or so.
rokatyy
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good!
Just couple of nitpickings and questions
Yacouby
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestions
| ) | ||
| return | ||
|
|
||
| run.spec.auth["token_name"] = "default" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could this break runs if we fall back to "default" even when that token is missing or invalid?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- you are using hardcoded value , what about creating a constant for it ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently we're always injecting default.
During ML-11600 we'll make the token resolution logic as described in the HLD https://iguazio.atlassian.net/wiki/spaces/MLRUN/pages/416121541/Secret+Token+Job+Function+Mounts+HLD#Token-name-resolution
So this is just temporary and will change.
mlrun/auth/utils.py
Outdated
| :param env: The environment dictionary to enrich. | ||
| :param db: The RunDBInterface instance to retrieve secret tokens. | ||
| :param auth_info: The AuthInfo object containing authentication details. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you have to update the docstring after removing those 2 params
mlrun/auth/utils.py
Outdated
| env["MLRUN_AUTH_TOKEN_ENDPOINT"] = ( | ||
| mlrun.mlconf.iguazio_api_url + "/api/v1/refresh-access-token" | ||
| env["MLRUN_AUTH_TOKEN_ENDPOINT"] = os.path.join( | ||
| mlrun.mlconf.iguazio_api_url, "/api/v1/refresh-access-token" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is wrong since the second argument starts with a /, it ignores the first part and will always return "/api/v1/refresh-access-token", which is not intended.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will remove the leading /
| mocked_responses = self._mock_list_namespaced_pods([[pod]]) | ||
| return mocked_responses[0].items | ||
|
|
||
| def test_mount_secret_token_to_runtime(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need to add more tests here as you check only the happy path where the secret exists. you should also cover cases like missing secrets, volume with the same name already exists
+
verifying the generate env variable is set in the runtime env
liranbg
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor comments
server/py/services/api/launcher.py
Outdated
|
|
||
| self._handle_retry(run) | ||
| run = self._pre_run_image_pull_secret_enrichment(run) | ||
| self._resolve_and_validate_token_name_from_secret(run) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| self._resolve_and_validate_token_name_from_secret(run) | |
| self._enrich_and_validate_auth_token_name(run) |
server/py/services/api/launcher.py
Outdated
| # for token_name in services.api.crud.Secrets().list_secret_tokens(self._auth_info.username): | ||
| # if self._validate_token_name(token_name): | ||
| # run.spec.auth["token_name"] = token_name | ||
| # return | ||
|
|
||
| # raise ValueError( | ||
| # "No valid authentication token found. " | ||
| # "Please create a valid offline token or provide one explicitly." | ||
| # ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this commented?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed it
liranbg
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
very well. minor suggestion
| run.spec.auth["token_name"] = "default" | ||
|
|
||
| def _validate_token_name(self, token_name: str, explicit: bool = False): | ||
| pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
leave the TODO here, above this function
| ], | ||
| 1, | ||
| ), | ||
| # Volume with a different name already exists (should add new one) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
both volumes has the the same name "other-volume", so this comment is wrong
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One is the volumes, the other is the volume mounts.
The volume that the function add is called secret
| # Volume with a different name already exists (should add new one) | ||
| ( | ||
| [{"mountPath": "/some/other/path", "name": "other-volume"}], | ||
| [{"name": "other-volume", "other-volume": {"items": []}}], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this valid Kubernetes schema ?
| runtime = mlrun.runtimes.kubejob.KubejobRuntime() | ||
| # # Runtime that inherit from BaseRuntime already have existing volumes/mounts | ||
| # runtime.spec.volume_mounts = [] | ||
| # runtime.spec.volumes = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this comment is a bit misleading, you say that runtime already have existing volumes, but then expect them to be empty ?
📝 Description
This PR updates the MLRun API to mount the Kubernetes secret corresponding to IG4’s offline token
default(for the running user) as a file inside the run pod at a predefined folder owned by MLRun. The folder path is exposed to the pod via an environment variable so that the SDK running inside the pod can access the token file.The
ServerSideLauncher._enrich_runtimemethod saves the token name inrun.spec.auth. TheKubejobRuntimeHandlerthen uses this token name to determine the Kubernetes secret name and mounts it on the runtime before pod creation.As a result,
volumesandvolume_mountsare added to both the runtime and pod specifications.The
MLRUN_AUTH_OFFLINE_TOKENenv var that was used in #8840 is removed and not needed anymore.🛠️ Changes Made
MLRUN_AUTH_OFFLINE_TOKENand setMLRUN_AUTH_WITH_OAUTH_TOKEN__TOKEN_FILEduring auth env enrichment for podsauthfield toRunSpecand toMLClientCtx(needed for it to be saved in DB)KubejobRuntimeHandler.run()✅ Checklist
🧪 Testing
🔗 References
🚨 Breaking Changes?
🔍️ Additional Notes
In future PR's, token other than
defaultwill be allowed and resolved