Apply peer authentication policy#20829
Conversation
incfly
left a comment
I have two high-level questions.
What's the rough idea for supporting auto mTLS? Wire up the policy resolution with the EDS code? Is every policy change a full push to ensure correctness?
When no port-level mTLS is specified, what's the behavior: secure everything, including pass-through listeners? When a port is specified, is only that per-port listener secured? And what if the port defined in the policy is not part of the existing workload's inbound listeners (aka, :)? Is that considered a misconfiguration?
Is this a valid use case: label selector && defined in the root namespace? Just want to confirm.
Any prior usage of this pattern? Request authn || authz?
For the first question, yes (for now), as we don't have a way to prevent selectors from being used in the root namespace yet (it would need a change in the validation webhook to know what the root namespace is, and it may not be useful given that the root namespace can be changed).
For the second, both request authn and authz are appendable, so they don't need this logic to disambiguate.
This is not a valid use case. We don't respect workload-specific config in the root namespace for Sidecar or EnvoyFilter, so let's follow that pattern for consistency and not create exceptions.
Do you mean we can ignore them (but not reject them at validation time)?
Right. Maybe we can change BuildLbEndpointMetadata to take the policy for that given workload into account.
That is 3 questions :), so yes, yes (assuming the parent level is set to DISABLE), and it depends. For the last one, I think we can either ignore it (based on the assumption that traffic to that port will get a 503 anyway), or still create an inbound listener for that port with those settings. Not sure which one is the right answer yet, though I'm leaning toward the latter.
/test integ-security-k8s-tests_istio
… take into account the policy
…remove old mesh policy during installation
Would prefer abstracting this into a function.
Let's please not use the old logic. It is not only complex but also error-prone. All we need is a simple function in the push context, PeerAuthenticationPolicyForNamespace(ns). This returns either the global policy or the namespace-local one. If there is no global one, we assume PERMISSIVE, just like the way we assume a default sidecar. We can reuse logic similar to the Sidecar/EnvoyFilter handling.
And we should NOT check the workload-specific authn policy: the cluster can belong to a subset or to an entire service. As we agreed previously, if the user is sophisticated enough to use a per-workload policy to disable peer authn, then she should also define destination rules, etc. to create the exception. This model will keep the implementation simple and robust, and most importantly scalable. We cannot be doing a label-based search for every possible service, for every sidecar in the system.
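The lookup suggested in this comment can be sketched as follows. Note this is an illustrative sketch, not the PR's actual code: the type names (`PeerPolicy`, `PushContext` fields) are stand-ins for the real istio/istio types; only `PeerAuthenticationPolicyForNamespace` is the name proposed in the review.

```go
package main

import "fmt"

// MutualTLSMode is a stand-in for pilot's internal mTLS mode enum.
type MutualTLSMode int

const (
	MTLSPermissive MutualTLSMode = iota
	MTLSStrict
	MTLSDisable
)

// PeerPolicy is a stand-in for a consolidated PeerAuthentication config.
type PeerPolicy struct {
	Mode MutualTLSMode
}

// PushContext pre-computes at most one policy per namespace at init time,
// analogous to sidecarsByNamespace in pilot's push context.
type PushContext struct {
	rootNamespace     string
	policyByNamespace map[string]*PeerPolicy
}

// PeerAuthenticationPolicyForNamespace returns the namespace-local policy if
// one exists, otherwise the mesh-wide policy from the root namespace, and
// otherwise a built-in PERMISSIVE default (mirroring the default sidecar).
func (pc *PushContext) PeerAuthenticationPolicyForNamespace(ns string) *PeerPolicy {
	if p, ok := pc.policyByNamespace[ns]; ok {
		return p
	}
	if p, ok := pc.policyByNamespace[pc.rootNamespace]; ok {
		return p
	}
	return &PeerPolicy{Mode: MTLSPermissive}
}

func main() {
	pc := &PushContext{
		rootNamespace: "istio-system",
		policyByNamespace: map[string]*PeerPolicy{
			"istio-system": {Mode: MTLSStrict},
			"foo":          {Mode: MTLSDisable},
		},
	}
	// Namespace-local policy wins; unknown namespaces fall back to mesh-wide.
	fmt.Println(pc.PeerAuthenticationPolicyForNamespace("foo").Mode == MTLSDisable)
	fmt.Println(pc.PeerAuthenticationPolicyForNamespace("bar").Mode == MTLSStrict)
}
```

The key property is that every query is a constant-time map lookup; no per-proxy allocation or label-based search happens on the query path.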
I agree. I implemented it this way so that we'd have something in case we don't have time to work on the better option. Anyway, I've reverted this now, and a separate PR will tackle auto mTLS alone.
rshriram
left a comment
Please change the logic in cluster.go. We cannot query by service labels, and we cannot compute the appropriate peer authn for every workload. All of this needs to be pre-computed per namespace, and we should only be doing lookups. The first fix should be in push context, where you initialize peer authn. Also, we need a built-in default peer authn for when there is no global peer authn policy. This will ensure that all namespaces always inherit the global one.
```go
	return model.MTLSPermissive
}
switch peer.Mtls.Mode {
case v1beta1.PeerAuthentication_MutualTLS_DISABLE:
```
And what about the UNSET mode?
Good point. Now it will be more complicated :)
It is actually simple if you use the same logic as the Sidecar.
```go
	level += 2
}
if isStrictlyStronger(spec, configByScope[level]) {
	configByScope[level] = cfg
```
This logic is memory-intensive, allocating an array and destroying it for each query, for each proxy in the system. It will also undergo a lot of changes when you introduce workload-specific sidecars. To me, a better way is to reuse the logic that builds sidecarsByNamespace (https://github.com/istio/istio/blob/master/pilot/pkg/model/push_context.go#L95). Then, as commented above, you simply have to query push context for the namespace's peer authn policy. Otherwise, this will become a performance bottleneck once again.
We already use that. This is a second round to pick from the short list (the input is the list of configs that match the workload) based on our precedence rules.
You don't need the list of configs; that is my point. Take a look at the code in push context. It simply scans through a hash map and picks the appropriate config without allocating more memory. If there is no namespace-level config, it automatically falls back to the global config, which always exists even if the user did not define a global setting.
There are a few reasons the sidecar approach doesn't work here:
- We want to take the 'strength' of the mTLS mode into account to break ties.
- We need to 'inherit' from the parent for UNSET mode (thanks to your other comment :) )
I've updated this PR to handle UNSET, but I'm still working on simplifying it a bit more, using the assumption that there is at most one namespace-level and one mesh-level policy (we already enforce that in validation), and also ignoring policies that are in the root namespace but have a selector (as you suggested).
This is a slice of pointers, []*model.Config, not []model.Config. Is that still a memory-intensity concern?
I re-implemented this function to take into account UNSET (which should inherit from the parent) and port-level settings (which need to be aggregated into the final policy, including resolving UNSET). It seems more complicated now, unfortunately, though I think it has the correct behavior.
PTAL
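The consolidation described here can be sketched as below. This is an illustrative sketch under the semantics discussed in the thread, not the PR's actual composePeerAuthentication: UNSET at the port level inherits the workload-level mode, and UNSET at the workload level inherits the namespace/mesh-level mode. All names are hypothetical stand-ins.

```go
package main

import "fmt"

// MutualTLSMode is a stand-in enum; UNSET means "inherit from parent".
type MutualTLSMode int

const (
	MTLSUnset MutualTLSMode = iota
	MTLSDisable
	MTLSPermissive
	MTLSStrict
)

// resolve replaces UNSET with the parent's mode.
func resolve(mode, parent MutualTLSMode) MutualTLSMode {
	if mode == MTLSUnset {
		return parent
	}
	return mode
}

// consolidate computes the effective workload-level mode and the effective
// per-port modes, resolving UNSET at each level against its parent.
func consolidate(parent, workload MutualTLSMode, ports map[uint32]MutualTLSMode) (MutualTLSMode, map[uint32]MutualTLSMode) {
	effective := resolve(workload, parent)
	out := make(map[uint32]MutualTLSMode, len(ports))
	for port, mode := range ports {
		out[port] = resolve(mode, effective)
	}
	return effective, out
}

func main() {
	// Namespace says STRICT; the workload leaves its mode UNSET but
	// explicitly disables mTLS on port 8080.
	eff, ports := consolidate(MTLSStrict, MTLSUnset, map[uint32]MutualTLSMode{
		8080: MTLSDisable,
		9090: MTLSUnset,
	})
	fmt.Println(eff == MTLSStrict)          // workload inherits STRICT
	fmt.Println(ports[8080] == MTLSDisable) // explicit port setting kept
	fmt.Println(ports[9090] == MTLSStrict)  // UNSET port inherits workload mode
}
```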
…ch also consolidate port-level policies.
```go
// Review ports with UNSET mode and update the final workload level mTLS if it is stronger.
for port := range unsetPorts {
	existing, exist := finalPolicy.PortLevelMtls[port]
	if !exist || getMutualTLSMode(finalPolicy.Mtls) > getMutualTLSMode(existing) {
```
Same here; it might require changing the signature of isStrictlyStronger to pass MutualTls rather than the policy.
```go
		},
	},
	{
		ConfigFile: "beta-mtls-on.yaml",
```
Would it help to have test cases where we have both a beta and an alpha policy? That can be a typical case during migration.
And what's the intended behavior? To me, it seems that if there's one beta policy instance (ns/mesh/workload) that can be applied to a particular workload, all alpha policies are ignored. Is that correct?
Feel free to do it in a follow-up PR and leave a TODO.
Which one? I didn't see it in the e2e test. All config files only reference the beta policy.
```go
			finalPolicy.PortLevelMtls[port] = mtls
		}
	}
}
```
You have to break out of the main for loop once you get a workload-specific policy. This will be the oldest policy. We cannot pick the strongest, as that has the potential to break workloads that don't even belong to the team that wrote this policy.
```go
	processedJwtRules: processedJwtRules,
	alphaApplier:      alpha_applier.NewPolicyApplier(policy),
	jwtPolicies:       jwtPolicies,
	peerPolices:       peerPolicies,
```
Do you need peerPolicies? It seems to me you only need to store consolidatePeerPolicy in the applier.
Oh, I mean the unit test. I'll add one for e2e later in a different PR.
I think we may need this for an analyzer tool.
```go
		},
		want: nil,
	},
	{
```
There's a less common case: define port-level settings in a mesh/namespace policy, and see how that's inherited && combined with a workload-level policy. Shall we add a case for that?
No. It can be rejected by validation.
incfly
left a comment
I left one more question on the behavior of port-level settings defined in the mesh/ns policy. Approving now.
```diff
 func composePeerAuthentication(rootNamespace string, configs []*model.Config) *v1beta1.PeerAuthentication {
-	var meshPolicy, namespacePolicy *v1beta1.PeerAuthentication
+	var meshPolicy, namespacePolicy, workloadPolicy *v1beta1.PeerAuthentication
+	// Creation time associated with the selected workloadPolicy above. Initial to max time (not set).
```
```go
if namespacePolicy != nil && !isMtlsModeUnset(namespacePolicy.Mtls) {
	// If a namespace policy is defined, update the output policy to the namespace policy.
	// This means the namespace policy overwrites the mesh policy.
	outputPolicy.Mtls = namespacePolicy.Mtls
```
Wait, are we abandoning inherit-the-stronger for namespace/mesh policies as well? I thought we only removed inheritance for the workload-level policy.
No, it never was that way. There is at most one namespace-level policy, so there is no conflict. Same for mesh. The narrower scope always wins.
```diff
@@ -343,111 +344,71 @@ func getMutualTLSMode(mtls *v1beta1.PeerAuthentication_MutualTLS) model.MutualTL
 // replaced with config from workload-level, UNSET in workload-level config will be replaced with
```
I didn't see updates to these comment sections; I expect there will be some.
Why does this need to be updated? It is just a conversion from a user-facing enum to our internal representation, so we can maintain the same semantics between different API versions.
This doc comment is stale:
// if there are more than 1 policy define
// port-level mTLS for the same port, the stronger one is used.
That is no longer true; we use the entire workload policy, ordered by timestamp, now.
incfly
left a comment
The istio/api docs shall have some updates as well, I assume?
We didn't say much about conflict resolution in the API :).
/test e2e-mixer-no_auth_istio
In response to a cherrypick label: #20829 failed to apply on top of branch "release-1.5":
* Apply beta peer authentication policy down to workload level
* Clean up
* Lint
* Check beta policy for auto mtls. This can be removed when EP metadata take into account the policy
* Use explicit peerauthentication policy for permissive, as we haven't remove old mesh policy during installation
* pilot/pkg/security/authn/v1beta1/policy_applier.go
* Move all test for beta mTLS api to the end
* Change to namespace policy
* Revert cluster.go
* Change peer authn consolidation algorithm for UNSET (inheritant mode)
* Reimplement getMostSpecificConfig (now composePeerAuthentication) which also consolidate port-level policies.
* Fix inheritance: do not inherit if it is weaker than the current mode
* Remove debug logs
* Change test policy to namespace level to make sure they are clean up properly with the existing test setup.
* Address comment
* Lint
* Simplify logic to pick the oldest
* fix typo
* Update function comment
Implement logic to select the peer authentication policy for a workload: most specific scope first, then most secure mode (i.e. STRICT > PERMISSIVE > DISABLE).
Implement logic to apply the peer authentication policy at the workload level and above (namespace, mesh), with fallback to the (alpha) authentication policy.
Port-level support and better auto mTLS will be taken care of in follow-up PRs.
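The selection rule stated in this description can be sketched as below. This is a hedged illustration with hypothetical names, not the PR's implementation, and note the review thread also discusses using the oldest policy as the tie-break among workload-level policies.

```go
package main

import "fmt"

// MutualTLSMode is ordered so that a larger value is more secure:
// STRICT > PERMISSIVE > DISABLE.
type MutualTLSMode int

const (
	MTLSDisable MutualTLSMode = iota
	MTLSPermissive
	MTLSStrict
)

// Scope is ordered so that a larger value is more specific:
// workload > namespace > mesh.
type Scope int

const (
	ScopeMesh Scope = iota
	ScopeNamespace
	ScopeWorkload
)

type Policy struct {
	Scope Scope
	Mode  MutualTLSMode
}

// selectPolicy picks the policy with the most specific scope; among policies
// at the same scope, the most secure mode wins.
func selectPolicy(policies []Policy) *Policy {
	var best *Policy
	for i := range policies {
		p := &policies[i]
		if best == nil || p.Scope > best.Scope ||
			(p.Scope == best.Scope && p.Mode > best.Mode) {
			best = p
		}
	}
	return best
}

func main() {
	got := selectPolicy([]Policy{
		{Scope: ScopeMesh, Mode: MTLSStrict},
		{Scope: ScopeNamespace, Mode: MTLSDisable},
	})
	// Namespace scope beats mesh scope even though its mode is weaker.
	fmt.Println(got.Scope == ScopeNamespace, got.Mode == MTLSDisable)
}
```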
Issue: #20746