Signed-off-by: Gustavo Carvalho <gustavo.carvalho@container-solutions.com>
design/001-secretsink.md
```yaml
secretStoreRefs:
  - name: secret-store
    kind: SecretStore
    type: Source # Can be Source or Sink
  - name: second-secret-store
    kind: SecretStore # or ClusterSecretStore
    type: Sink # There can only be 0..1 Source, or we make the first entry the Source.
```
If we are allowed to define both Source and Sink, we cannot apply the very same data and dataFrom mapping logic in both directions:
I understand data and dataFrom conceptually as a mapping function that takes a list of secrets (from the provider) and produces a map[string][]byte as output which is used to construct a secret.
```go
func fn(input map[string][]byte) (output map[string][]byte)
```

With the below ExternalSecret we would have to re-use fn and apply it both ways (which is very inflexible):

```
fn(provider-data) -> k8s-secret
fn(k8s-secret)    -> sink
```
We would need a different mapping function fn2 or create distinct ExternalSecrets for Sinks/Sources.
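The asymmetry can be illustrated with a toy version of fn (all names here are hypothetical, not the actual codebase): a mapping table built for the provider -> Kubernetes direction renames remote keys to secret keys, so reusing the same table for Kubernetes -> Sink looks up keys that no longer exist.

```go
package main

import "fmt"

// fn renames keys according to a data-style mapping (secretKey <- remoteRef.key).
// This direction works for provider -> Kubernetes.
func fn(mapping map[string]string, input map[string][]byte) map[string][]byte {
	out := map[string][]byte{}
	for secretKey, remoteKey := range mapping {
		if v, ok := input[remoteKey]; ok {
			out[secretKey] = v
		}
	}
	return out
}

func main() {
	mapping := map[string]string{"my_foo": "foo", "my_bar": "bar"}
	providerData := map[string][]byte{"foo": []byte("one"), "bar": []byte("two")}

	k8sSecret := fn(mapping, providerData)
	fmt.Println(string(k8sSecret["my_foo"])) // one

	// Reusing fn for Kubernetes -> Sink yields nothing: the secret now holds
	// my_foo/my_bar, but the mapping still looks up foo/bar.
	sink := fn(mapping, k8sSecret)
	fmt.Println(len(sink)) // 0
}
```

This is why a second mapping function (or a second resource with its own mapping) is needed for the push direction.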
My idea was, looking at the code base we have right now: both getSecretData and getSecretMap actually return a []byte that is used by the controller to create the map[string][]byte, meaning the handling of that signature is defined at the reconciler level. For this approach to work, we would already have to change reconciler code, because now we would have multiple SecretStores.
I think the fn you're referring to is this (where the input is actually obtained inside, from the ExternalSecret object):

```go
func (r *Reconciler) getProviderSecretData(ctx context.Context, providerClient provider.SecretsClient, externalSecret *esv1alpha1.ExternalSecret) (map[string][]byte, error)
```
My take when creating the spec was indeed an fn2 in this fashion:

```go
func (r *Reconciler) setProviderSecretData(ctx context.Context, providerClient provider.SecretsClient, externalSecret *esv1alpha1.ExternalSecret) error
```

The input map[string][]byte would come from the ExternalSecret data itself (we use the data from the ExternalSecret to secrets().get() from the APIServer, or use the v1.Secret directly instead of fetching from the ES).
For that to work, we would need to change the getStore method to return at least a slice of Stores (if not an even more complicated structure), and add some logic to decide whether to call getProviderSecretData or setProviderSecretData.
Either way, this change as I've imagined it would imply big controller changes, which is something we might not be up for in the first place.
This is a first draft, more to stimulate community folks interested in #347 to take a look and speak up.
I'm not yet talking about the implementation, just conceptually.
What I mean is (let's take a complete example):
We have a store provider-1 with:
| key | value |
|---|---|
| foo | one |
| bar | two |
And a user wants a Kind=Secret like this:
```yaml
kind: Secret
data:
  my_foo: "one" # let's ignore b64 for now
  my_bar: "two"
```

The user has to define an ExternalSecret as:
```yaml
kind: ExternalSecret
data:
  - secretKey: my_foo
    remoteRef:
      key: foo
  - secretKey: my_bar
    remoteRef:
      key: bar
```

Mapping the secret keys
The data array acts as a mapping function that translates foo to my_foo for the direction Source -> Kubernetes.
But how do we map the secret keys to remote provider keys? Would that then be via data and dataFrom? If so, then we very likely can not have the same translation function for the transformation of values Source -> Kubernetes and Kubernetes -> Sink. Either we have two mapping functions or we define just one Store: Either Source or Sink.
If i understand you correctly you would use data and dataFrom as transformation functions when pushing secrets from Kubernetes to a provider. Is that correct?
I would like to propose the following changes:
(1) Let's create a different custom resource for this use-case. Not only for separation of concerns (ExternalSecrets = Pull / ExternalSink = Push) and complexity, but also for semantic reasons: data and dataFrom don't really speak for themselves - what they do and how they work. I think it makes sense to start with a clean slate for this particular feature and not "make it work" with the existing CRD.
(2) I really like the idea of re-using SecretStore and specifying multiple SecretStores in a SecretSink 👍 - if we just focus on the push part then we can omit the type=Source|Sink field.
(3) I would like to be explicit: we allow selecting just one secret per ExternalSink. Why? So we don't run into naming collisions (secret keys can be defined multiple times when working with multiple secrets) and the controller doesn't need to take care of resolving them. Also, there's a risk of race conditions (unstable sorting) when working with multiple resources at once.
```yaml
# Select one secret explicitly by its name because we want to avoid collisions:
# many secrets may have the same key defined.
selector:
  # have the option to add other types later (ConfigMap, ...?)
  secret:
    # we have the option to use other kinds of matching
    name: my-secret
```

For mapping the keys I propose two options.
(1) Explicit matching
As a user I would want to push individual keys to a provider and define a mapping between the secret keys and the remote provider keys.
```yaml
match:
  - secretKey: game.properties
    # (!!!) remoteKey defaults to the secretKey.
    # It can be omitted if a user wants them to be identical at the provider.
    remoteKey: /dev/game.properties
```

That would give the user a shorthand for pushing keys to the remote provider like this:
```yaml
match:
  - secretKey: tls.crt
  - secretKey: tls.key
  - secretKey: ca.crt
```

(2) Regexp / capture groups
Explicit matching is tedious; it can be improved with regexp capture groups (or named groups), which make the matching more generic.
An alternative would be go templates, but I don't think they'd be a good fit because we just work with short, single-line strings.
```yaml
rewrite:
  - secretKey: game.(.+)
    remoteKey: /dev/$1
```

Bringing it together
Full example:

```yaml
apiVersion: external-secrets.io/v1alpha1
kind: SecretSink
spec:
  secretStoreRefs:
    - name: aws-primary
      kind: SecretStore
    - name: gcp-failover
      kind: SecretStore
  selector:
    secret:
      name: something
  match:
    - secretKey: game.properties
      remoteKey: /dev/game.properties
  rewrite:
    - secretKey: game.(.+)
      remoteKey: /dev/$1
```

Some more examples:
```yaml
apiVersion: external-secrets.io/v1alpha1
kind: SecretSink
spec:
  # [...]
  # explicitly define which keys should be pushed to the remote
  match:
    - secretKey: tls.crt # remoteKey=secretKey if omitted
    - secretKey: tls.key
    - secretKey: ca.crt
---
apiVersion: external-secrets.io/v1alpha1
kind: SecretSink
spec:
  # [...]
  # same as above but uses rewrite for brevity
  rewrite:
    - secretKey: (.+) # matches every key
      remoteKey: $1
---
apiVersion: external-secrets.io/v1alpha1
kind: SecretSink
spec:
  # [...]
  # adding a prefix
  rewrite:
    - secretKey: (.+)
      remoteKey: /prefix/prod/$1
```
I understand your reasoning. Indeed I was thinking of using the same key names from data and dataFrom, and I can see why this is undesirable (for instance, when syncing specific secrets to provider namespaces in a pattern/ns/key). It gets even worse when each provider has its own validating regexp for the keys, so a key in AWS SSM might need to be rewritten for Azure KV simply because it rejects a '/'. And yes, if we cannot sink to different stores than the original, we might as well just close this draft, as it would add no significant value :P
So yes, I agree we really should have the ability to use a different pattern than data and dataFrom.
I still have one concern about actually separating ExternalSecrets from SecretSink: how would we avoid a deadlock? E.g. an ExternalSecret feeding a Secret that is used by a SecretSink, which feeds back into the same ExternalSecret - a single change on either side could cause the two secrets to keep re-updating each other. In a single CRD this would be prevented by specifying whether the Store is a Source or a Sink...
The only way I can think of to solve this is the overkill operation (steps 3-5):

1. Get the SecretSink.
2. Get the Secret.
3. Check if the Secret name matches any ExternalSecret definition (I believe ownership is not set if creationPolicy is Merge, so we cannot use it).
4. Check that ExternalSecret's SecretStore to see if it's the same Store/secret as the one referred to in the SecretSink.
5. Block if it is.
Do you all imagine a better approach for this coming issue?
(1) Do you all imagine a better approach for this coming issue?
I see your problem. It comes down to the fact that we're having two async processes that fight with each other. We have to synchronize the processes in order to resolve that race condition:
E.g. by acquiring a lock and defining a policy on how to resolve it (aka who is leading: Kubernetes or the remote provider). "Acquiring a lock" could be a simple in-memory map[key]*State:
```go
// key is the namespace/name of a secret
type key string

type State struct {
	mu     *sync.Mutex
	policy esv1alpha1.LeaderPolicy
	writer esv1alpha1.ExternalSecret
	reader []esv1alpha1.ExternalSink
}
```

Whenever a controller reconciles a resource - whether it be ExternalSecret or SecretSink - it would have to acquire the lock and take action based on a policy: push-before-pull or vice versa. How do we know which ExternalSecret to reconcile? Either we store a mapping in-memory or attach a label/status field to the Secret/ExternalSecret/SecretSink.
When we boot the controller we have to make sure that we populate the stateMap with the current state of the world first before we reconcile the resources.
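A minimal, runnable sketch of such a state map (all type and policy names here are assumptions for illustration, not existing code) could look like this:

```go
package main

import (
	"fmt"
	"sync"
)

// LeaderPolicy decides which direction wins when both controllers race.
type LeaderPolicy string

const (
	PushBeforePull LeaderPolicy = "PushBeforePull"
	PullBeforePush LeaderPolicy = "PullBeforePush"
)

// secretState is the per-secret lock plus its conflict-resolution policy.
type secretState struct {
	mu     sync.Mutex
	policy LeaderPolicy
}

// stateMap guards per-secret state; the key is "namespace/name" of a Secret.
type stateMap struct {
	mu     sync.Mutex
	states map[string]*secretState
}

// acquire returns the locked state for a secret, creating it on first use.
// The caller must Unlock the returned state's mu when done reconciling.
func (s *stateMap) acquire(key string, policy LeaderPolicy) *secretState {
	s.mu.Lock()
	st, ok := s.states[key]
	if !ok {
		st = &secretState{policy: policy}
		s.states[key] = st
	}
	s.mu.Unlock()
	st.mu.Lock()
	return st
}

func main() {
	sm := &stateMap{states: map[string]*secretState{}}
	st := sm.acquire("default/my-secret", PushBeforePull)
	defer st.mu.Unlock()
	fmt.Println(st.policy) // PushBeforePull
}
```

Both reconcilers would call acquire before touching the Secret, so only one direction can be in flight at a time, and the stored policy decides who leads on conflict.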
(2) Do you all imagine a better approach for this coming issue?
Embedding the sink in the ExternalSecret would resolve that issue, too.
In either case we still need (1) a policy to define who's leading and (2) a mapping function for the push direction.
My gut feeling tells me we should go with two separate resources, but I see a significant increase in complexity and yet-to-be-discovered failure modes 😵💫
Either way, the things needed for (2) are also needed for (1). I do think we can start with different CRDs, and if development gets way too complex, we can merge the CRDs together and reuse part of the methods and structure for (2).
I'll change the draft appropriately.
After thinking about it, I don't see why someone would want to define the Sink as a priority over the contents of an ExternalSecret (in the sense that, if the ExternalSecrets exist to create/merge a Secret, the source of truth could never be that Secret). A Sink referencing the same SecretStore as the ExternalSecret is probably an operator mistake - and yes, mistake or not, we still need to take it into account.
Using this premise, what we can do is:
(3):
We can use labels on the Secret to find the appropriate SecretStore it comes from (the ES labeling external-secrets.io/secret-store: store or external-secrets.io/cluster-secret-store: store) - both O(1) operations on the ExternalSecret and SecretSink reconciles.
When a reconcile on any SecretSink happens, we can then (at least in the beginning of this feature) refuse to sync back to the same store.
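That O(1) guard could be as simple as the following sketch (the label keys come from the comment above; the helper itself is hypothetical):

```go
package main

import "fmt"

// blockedSameStore reports whether a SecretSink would push back into the
// store the Secret originally came from, based on the labels the
// ExternalSecret controller would set on the Secret.
func blockedSameStore(secretLabels map[string]string, sinkStore string) bool {
	for _, key := range []string{
		"external-secrets.io/secret-store",
		"external-secrets.io/cluster-secret-store",
	} {
		if v, ok := secretLabels[key]; ok && v == sinkStore {
			return true // same store as the source: refuse to sync back
		}
	}
	return false
}

func main() {
	labels := map[string]string{"external-secrets.io/secret-store": "aws-primary"}
	fmt.Println(blockedSameStore(labels, "aws-primary"))  // true
	fmt.Println(blockedSameStore(labels, "gcp-failover")) // false
}
```

The SecretSink reconciler would run this check per referenced store and skip (or error on) any store that matches, breaking the feedback loop described above.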
Kudos, SonarCloud Quality Gate passed!
moolen left a comment:
LGTM. I think this is ready and can be implemented.
As discussed in the community meeting today, this is accepted and ready to be implemented.
This is a Draft proposal for the CRD change for #347.
Note: after typing out the original idea of a separate CRD, it really feels awkward to have so many duplicate fields just to add the capability to sync the secret back. Taking that into account, this proposal makes small 'surgical' changes to our current CRDs to allow this feature. Although most of them are optional, some of them cause breaking changes.
This is not even close to a final draft, as we might need to think more about the specifics of it all.
Signed-off-by: Gustavo Carvalho gustavo.carvalho@container-solutions.com