[Kaniko] Add S3/MinIO Build Context Support and Improve Builder Environment Handling #8849
Conversation
Force-pushed from 99dfa98 to ca30161
- Enable S3 build context in Kaniko:
  - Add `_enrich_kaniko_env_for_s3_context()` to auto-set `AWS_REGION=us-east-1`
    and `S3_FORCE_PATH_STYLE=true` when using `s3://`
  - Pass builder env vars to the Kaniko pod and convert AWS creds to secrets
- K8s pod refactor:
  - Rename `BasePod` → `Pod`
  - Maintain an explicit env list and add `add_env_var()`
  - Implement `replace_env_vars_with_secrets()` and `_create_secret_for_env_var()`
    to materialize sensitive env vars as K8s secrets
- Builder / build flow improvements:
  - Accept env from the request or the function spec (`builder_env` falling back to `spec.env`)
  - Normalize builder env handling and logging via `mlrun.utils.helpers.logger`
  - Call `function.try_auto_mount_based_on_config()` before deploy
  - Treat `s3://`, `v3io://`, and `http(s)://` sources distinctly; mount/copy as needed
  - Generate project secret-backed envs without overriding provided vars
https://iguazio.atlassian.net/browse/CEML-520
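A minimal sketch of what the enrichment helper described above might look like (names and defaults follow the commit message; this is an assumption, not the actual implementation):

```python
from kubernetes import client


def _enrich_kaniko_env_for_s3_context(env_vars: list[client.V1EnvVar]) -> None:
    """Sketch: ensure Kaniko has the S3 settings it needs for an s3://
    build context, without overriding values the caller already set."""
    existing = {env.name for env in env_vars}
    if "AWS_REGION" not in existing:
        # Kaniko requires a region to be set even for MinIO;
        # us-east-1 is the AWS default
        env_vars.append(client.V1EnvVar(name="AWS_REGION", value="us-east-1"))
    if "S3_FORCE_PATH_STYLE" not in existing:
        # MinIO serves buckets at path-style URLs (http://host/bucket/key)
        env_vars.append(
            client.V1EnvVar(name="S3_FORCE_PATH_STYLE", value="true")
        )
```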
Force-pushed from 908a04f to 494998f
Force-pushed from e330044 to 47893ac
liranbg left a comment:
The PR incorporates a few changes at once:
1. Fix for the S3 region / force path style
2. Env vars to secrets
3. Access key handling
This PR needs to be broken into smaller PRs; each carries its own risks and needs to be handled carefully. For (2), I am not sure I understand the reasoning behind it. For (3), it seems incomplete, and also, why is it needed? And for (1), I think the safest way to proceed is to check whether we can already overcome the issue by providing explicit Kaniko build args.
    def add_env_var(self, env_var: client.V1EnvVar):
        self.env.append(env_var)
The operation is not idempotent and is prone to duplicating env vars.
Neither is `add_volume`; why do we need it to be idempotent?
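For comparison, a sketch of an idempotent variant that replaces an existing entry by name instead of appending (not part of the PR):

```python
from kubernetes import client


def add_env_var(self, env_var: client.V1EnvVar):
    # Replace an existing entry with the same name instead of appending,
    # so repeated calls do not accumulate duplicate env vars
    for index, existing in enumerate(self.env):
        if existing.name == env_var.name:
            self.env[index] = env_var
            return
    self.env.append(env_var)
```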
    secret_env_var = self._create_secret_for_env_var(
        name=env_var.name,
        value=env_var.value,
        k8s_helper=k8s_helper,
    )
    self.env[index] = secret_env_var
Which component is responsible for removing those secrets later on? If I have 100 env vars, will it create 100 secrets? What is the reasoning behind this?
This is exactly the same as what we do in `mlrun_pipelines.common.ops._replace_env_vars_with_secrets`, which was added in a PR you approved, where we create obfuscated secret-backed env vars for each pipeline run.
The reasoning is that we do not want AWS/V3IO credentials to be visible in pod YAMLs.
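In Kubernetes terms, replacing a plain env var with a secret-backed one means swapping its `value` for a `valueFrom.secretKeyRef`; a sketch with a hypothetical helper name:

```python
from kubernetes import client


def _env_var_from_secret(
    env_var: client.V1EnvVar, secret_name: str
) -> client.V1EnvVar:
    # Reference the value from a Kubernetes secret instead of embedding it,
    # so the credential does not appear in the pod YAML
    return client.V1EnvVar(
        name=env_var.name,
        value_from=client.V1EnvVarSource(
            secret_key_ref=client.V1SecretKeySelector(
                name=secret_name,
                key=env_var.name,
            )
        ),
    )
```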
    secret_name, _ = k8s_helper.store_auth_secret(
        username=name,
        access_key=value,
    )
Is that aligned with igz4/oauth/igz3 expectations?
Again, this is the same as `mlrun_pipelines.common.ops._replace_env_vars_with_secrets`.
| name="AWS_REGION", | ||
| value="us-east-1", |
Why default to this region?
Because Kaniko expects a region to be set, even for MinIO, and this is the default AWS region.
> Because Kaniko expects a region to be set, even for MinIO, and this is the default AWS region.

Not 100%; on AWS, an admin can set their own default region. I suggest we do not set a default ourselves: obtain it via AWS instead (perhaps using the aws command in an init container, or via MLRun configuration), but definitely not hardcoded.
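A sketch of the suggested alternative: resolve the region from the environment or the AWS CLI config rather than hardcoding it (hypothetical helper; assumes the aws CLI may be available):

```python
import os
import subprocess


def _resolve_default_region() -> str | None:
    # Prefer whatever the admin already configured in the environment
    region = os.environ.get("AWS_REGION") or os.environ.get("AWS_DEFAULT_REGION")
    if region:
        return region
    try:
        # `aws configure get region` reads the CLI config (~/.aws/config)
        result = subprocess.run(
            ["aws", "configure", "get", "region"],
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip() or None
    except (FileNotFoundError, subprocess.CalledProcessError):
        # No default found; let the caller decide (e.g. mlrun configuration)
        return None
```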
    # MinIO requires path-style URLs
    already_has_path_style = any(
        env.name == "S3_FORCE_PATH_STYLE" and env.value for env in env_vars
    )
This is not 100% safe: "new" AWS regions do not support this env/configuration, as they require virtual-hosted-style paths. Maybe we can force this configuration when deploying against MinIO (an MLRun CE-like configuration), but not when working natively against S3.
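One way to address this, sketched under the assumption that a custom S3 endpoint is configured for MinIO-style deployments (hypothetical function and parameter names):

```python
from kubernetes import client


def _maybe_force_path_style(
    env_vars: list[client.V1EnvVar], s3_endpoint: str | None
) -> None:
    # Only force path-style addressing for custom endpoints such as MinIO;
    # native AWS regions expect virtual-hosted-style URLs
    if not s3_endpoint or s3_endpoint.endswith("amazonaws.com"):
        return
    if not any(env.name == "S3_FORCE_PATH_STYLE" for env in env_vars):
        env_vars.append(
            client.V1EnvVar(name="S3_FORCE_PATH_STYLE", value="true")
        )
```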
liranbg left a comment:
Looking better. Worth adding some unit-test coverage for the conventional flows.
Revert "[Kaniko] Add S3/MinIO Build Context Support and Improve Builder Environment Handling" (#9052)
Reverts #8849 due to:
1. https://iguazio.atlassian.net/browse/ML-11743
2. https://iguazio.atlassian.net/browse/ML-11731
Overall, there are a few issues with this PR: it enriches the Kaniko pod with S3 information while ignoring project secrets. The PR needs to be rewritten and covered intensively by tests.
📝 Description
Extend Kaniko build support to handle S3 build-context sources and improve how builder environment variables are generated.
Also standardize logging calls in builder flows and ensure Kubernetes pod environments are always initialized as typed lists.
🛠️ Changes Made
Kubernetes Helper
- Initialize `Pod.env` as a typed list (`list[V1EnvVar]`) instead of `None` to ensure safe and consistent mutation.

Builder
- Add an `is_s3_source` flag and handle `s3://` similarly to local/v3io contexts.
- `_enrich_kaniko_env_for_s3_context()`: for `s3://` sources, set `AWS_REGION` using the configured `"default-region"`, and set `S3_FORCE_PATH_STYLE=true` for MinIO compatibility.
- Replace `mlrun.utils.logger.*` calls with `mlrun.utils.helpers.logger.*` for consistent structured logging.
- `_generate_builder_env()`: return a list of env entries (`[{name, value}]`) instead of a dict, converted to `V1EnvVar` objects; see the sketch below.
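A sketch of the described return-shape change (assumed signature; the real function likely takes additional arguments):

```python
from kubernetes import client


def _generate_builder_env(builder_env: dict[str, str]) -> list[client.V1EnvVar]:
    # Return a typed list of V1EnvVar objects instead of a plain dict,
    # preserving the caller's insertion order
    return [
        client.V1EnvVar(name=name, value=value)
        for name, value in builder_env.items()
    ]
```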
🧪 Testing
- `s3://` → correct region inference and MinIO path-style injection.
- `v3io://`, local paths, and `http://` → unchanged behavior.
🔗 References
🚨 Breaking Changes?
🔍 Additional Notes