cli: support COCKROACH_REDACTION_POLICY_MANAGED env var#86475
cli: support COCKROACH_REDACTION_POLICY_MANAGED env var#86475craig[bot] merged 1 commit intocockroachdb:masterfrom
COCKROACH_REDACTION_POLICY_MANAGED env var#86475Conversation
7445944 to
b6b2a76
Compare
knz
left a comment
There was a problem hiding this comment.
Reviewed 1 of 15 files at r2.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @abarganier and @logston)
pkg/base/config.go line 163 at r2 (raw file):
// of a managed service (e.g. CockroachCloud). Impacts log // redaction policies. Managed bool
Not here. Add the field to cliCtx in pkg/cli/context.go.
pkg/base/config.go line 270 at r2 (raw file):
func (cfg *Config) InitDefaults() { cfg.Insecure = defaultInsecure cfg.Managed = defaultManaged
Do this in setCliConextDefaults, pkg/cli/context.go.
pkg/cli/context.go line 454 at r2 (raw file):
serverCertPrincipalMap []string serverListenAddr string serverManaged bool
no need for this.
pkg/cli/flags.go line 476 at r2 (raw file):
// NB: Insecure is deprecated. See #53404. cliflagcfg.BoolFlag(f, &startCtx.serverInsecure, cliflags.ServerInsecure) cliflagcfg.BoolFlag(f, &startCtx.serverManaged, cliflags.ServerManaged)
cliCtx.managed
pkg/cli/flags.go line 894 at r2 (raw file):
// NB: Insecure is deprecated. See #53404. cliflagcfg.BoolFlag(f, &startCtx.serverInsecure, cliflags.ServerInsecure) cliflagcfg.BoolFlag(f, &startCtx.serverManaged, cliflags.ServerManaged)
ditto
pkg/cli/flags.go line 994 at r2 (raw file):
serverCfg.User = username.NodeUserName() serverCfg.Insecure = startCtx.serverInsecure serverCfg.Managed = startCtx.serverManaged
no need for this
pkg/util/log/redact.go line 223 at r2 (raw file):
// and does not need to be treated as sensitive. This function marks // the provided object as safe/unsafe accordingly. func SafeManaged(s interface{}) redact.RedactableString {
Nah, this is not the right way to go about this. Let's get away from global variables. Try this instead:
- here:
func SafeManaged(a interface{}) interface{} {
return safeManaged{a}
}
type safeManaged struct {a interface{}}- in
makeUnstructuredEntry, taking inspiration from #82891, something like that:
for i, a := range args {
if e, ok := a.(safeManaged); ok && logger.managed {
args[i] = e.a
}
}(you might need to copy the args slice if there's a safeManaged in it, instead of overwriting it in-place, i'm not sure)
this will likely need to make makeUnstructuredEntry a method of *loggerT, but I think that's OK.
The flip side is that it opens the door to choosing the redaction per logger (i.e. per channel) too.
knz
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @abarganier and @logston)
pkg/util/log/redact.go line 223 at r2 (raw file):
safeManaged
btw that new safeManaged type will need a Format and/or SafeFormat method too.
|
I encourage you to study how the |
b6b2a76 to
ebe5eea
Compare
abarganier
left a comment
There was a problem hiding this comment.
Thanks for the feedback - RFAL
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @knz and @logston)
pkg/base/config.go line 163 at r2 (raw file):
Previously, knz (kena) wrote…
Not here. Add the field to
cliCtxin pkg/cli/context.go.
Done.
pkg/base/config.go line 270 at r2 (raw file):
Previously, knz (kena) wrote…
Do this in
setCliConextDefaults, pkg/cli/context.go.
Done.
pkg/cli/context.go line 454 at r2 (raw file):
Previously, knz (kena) wrote…
no need for this.
Done.
pkg/cli/flags.go line 476 at r2 (raw file):
Previously, knz (kena) wrote…
cliCtx.managed
Done.
pkg/cli/flags.go line 894 at r2 (raw file):
Previously, knz (kena) wrote…
ditto
Done.
pkg/cli/flags.go line 994 at r2 (raw file):
Previously, knz (kena) wrote…
no need for this
Done.
pkg/util/log/redact.go line 223 at r2 (raw file):
Previously, knz (kena) wrote…
safeManaged
btw that new
safeManagedtype will need a Format and/or SafeFormat method too.
Thanks for this advice - the separate safeManaged type implementing redact.SafeFormatter is a much better approach.
I gave modifying makeUnstructuredEntry a try and defining it on the *loggerT itself. Things were going great, until some testing at the end revealed that handling the conditional unwrapping depending on logger.managed within makeUnstructuredEntry doesn't cover cases such as this one where RedactableBytes() is called on a redact.StringBuilder.
Instead, we need to know the state of the managed flag at the time SafeManaged was called, or at the time SafeFormat is called. Otherwise, my interpretation is that we won't have full coverage. Let me know if I'm missing something.
As a result, I chose to store a copy of the state of the managed flag at call time for SafeManaged (but I don't feel strongly about it over checking isManaged() when executing SafeFormat). I've added a managed bool to the new type, which we can reference within SafeFormat to make the determination there on whether to mark safe or not. This covers cases such as RedactableBytes nicely.
I realize this still leaves us depending on the package-global logging variable 🤔. As an alternative, I chose to at least define setManaged and isManaged on the loggingT itself. I didn't see a good way to include it as part of the loggerT itself, but I'm open to alternatives - let me know any thoughts you might have.
ebe5eea to
3f8f5b5
Compare
There was a problem hiding this comment.
I did some thinking about this over the weekend. To optimize for meeting our requirements before we cut for 22.2, I'd like to keep the implementation here simple where we choose to wrap in safeManaged/leave as-is at the time we call SafeManaged.
Looking forward, I am feeling the need for a more holistic solution to these "conditional redaction" cases. I've created #87038 to track this work & come up with a broader solution that aligns with our future goals of multiple logging configurations per-process.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @knz and @logston)
pkg/util/log/redact.go line 223 at r2 (raw file):
As a result, I chose to store a copy of the state of the managed flag at call time for SafeManaged (but I don't feel strongly about it over checking isManaged() when executing SafeFormat). I've added a managed bool to the new type, which we can reference within SafeFormat to make the determination there on whether to mark safe or not. This covers cases such as RedactableBytes nicely.
After some sleep & advice from the team, I realized that storing the state of managed within safeManaged is completely unnecessary. We'll just check it at the time SafeManaged is called & make the decision to wrap/leave as-is then.
f42a094 to
838aa0a
Compare
andreimatei
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @abarganier, @knz, and @logston)
pkg/cli/cliflags/flags.go line 729 at r4 (raw file):
} ServerManaged = FlagInfo{
instead of a flag, would it be better to make this part of the logconfig.Config ?
pkg/cli/cliflags/flags.go line 733 at r4 (raw file):
EnvVar: "COCKROACH_MANAGED", Description: ` Indicate that the node being started is being run as part of a managed service.
s/being started/
pkg/util/log/redact.go line 226 at r4 (raw file):
// NB: If the argument itself implements the redact.SafeFormatter interface, // then we delegate to its implementation in either case. func SafeManaged(a interface{}) interface{} {
consider leaving a link to #87038 in a comment
pkg/util/log/redact.go line 230 at r4 (raw file):
return safeManaged{a}
instead of this, could we do return redact.Safe(a) and get rid of safeManaged ? Wouldn't that be simpler?
I think the safeManaged struct would be useful if the decision to the argument or not was deferred to logging time, but given the current structure I don't understand what it's for.
838aa0a to
c2e3bcd
Compare
abarganier
left a comment
There was a problem hiding this comment.
TFTR, RFAL
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @andreimatei, @knz, and @logston)
pkg/cli/cliflags/flags.go line 729 at r4 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
instead of a flag, would it be better to make this part of the
logconfig.Config?
I gave this some thought and I'd like to keep it as a flag. I could see this flag being used in the future for purposes beyond just logging, where we want to alter behavior elsewhere if we're running in a managed service.
pkg/cli/cliflags/flags.go line 733 at r4 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
s/being started/
Done.
pkg/util/log/redact.go line 226 at r4 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
consider leaving a link to #87038 in a comment
Done.
pkg/util/log/redact.go line 230 at r4 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
return safeManaged{a}
instead of this, could we do
return redact.Safe(a)and get rid ofsafeManaged? Wouldn't that be simpler?
I think thesafeManagedstruct would be useful if the decision to the argument or not was deferred to logging time, but given the current structure I don't understand what it's for.
You're right - we've essentially come full circle on this 😅 . Switched it to the simpler approach.
andreimatei
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @abarganier, @andreimatei, @knz, and @logston)
pkg/cli/cliflags/flags.go line 729 at r4 (raw file):
Previously, abarganier (Alex Barganier) wrote…
I gave this some thought and I'd like to keep it as a flag. I could see this flag being used in the future for purposes beyond just logging, where we want to alter behavior elsewhere if we're running in a managed service.
I don't very clearly see how "running as a managed service" is tied to the logging changes being made here. It seems to me that it's not the fact that it's a managed service that makes all the things you're unredacting safe, but that it's CRL who runs this managed service and we don't intend to give anyone else debug.zip's produced by it. So perhaps consider --crl-managed as the flag name. I generally feel that we haven't dissected enough what it is that makes what we're doing here safe, so I'd still vote we relegate this setting to a more narrowly-scoped log config option. But whatever you think.
pkg/cli/cliflags/flags.go line 733 at r5 (raw file):
EnvVar: "COCKROACH_MANAGED", Description: ` Indicate that the node is being run as part of a managed service/ This primarily
there's a slash at the end of the first sentence
abarganier
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @abarganier, @andreimatei, @knz, and @logston)
pkg/cli/cliflags/flags.go line 729 at r4 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
I don't very clearly see how "running as a managed service" is tied to the logging changes being made here. It seems to me that it's not the fact that it's a managed service that makes all the things you're unredacting safe, but that it's CRL who runs this managed service and we don't intend to give anyone else
debug.zip's produced by it. So perhaps consider--crl-managedas the flag name. I generally feel that we haven't dissected enough what it is that makes what we're doing here safe, so I'd still vote we relegate this setting to a more narrowly-scoped log config option. But whatever you think.
I think you make a good point - I'll prepare a patch to move this into the logging config
c2e3bcd to
c5a9f9d
Compare
abarganier
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei, @knz, and @logston)
pkg/cli/cliflags/flags.go line 729 at r4 (raw file):
Previously, abarganier (Alex Barganier) wrote…
I think you make a good point - I'll prepare a patch to move this into the logging config
After further consideration, our time table to get this delivered as part of the larger PCI redaction project is too tight to rework the implementation at this point. I'm going to leave as-is.
pkg/cli/cliflags/flags.go line 733 at r5 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
there's a slash at the end of the first sentence
Done.
knz
left a comment
There was a problem hiding this comment.
Overall OK, but I'm 👎 on adding a CLI flag at all. IMHO this should be exclusively an env var. See rationale below.
Reviewed 8 of 15 files at r2, 12 of 14 files at r3, 1 of 2 files at r4, 1 of 2 files at r5, 1 of 1 files at r6, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @abarganier, @andreimatei, @knz, and @logston)
pkg/cli/cliflags/flags.go line 730 at r6 (raw file):
ServerManaged = FlagInfo{ Name: "managed",
I think a flag with this word structure --managed is not sufficiently descriptive.
Can we make it --keep-operational-details-during-redaction? or perhaps --log-redact-policy=managed
But really I am also wondering if this needs a documented command-line flag at all. I think the env var should be sufficient? Especially for CC deployments, adding a new flag means that the operator team needs to detect the version of crdb before they decide to add the flag.
With an env var, they can set it for every version without causing errors and the new version will pick it up automatically.
abarganier
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei, @knz, and @logston)
pkg/cli/cliflags/flags.go line 730 at r6 (raw file):
Previously, knz (kena) wrote…
I think a flag with this word structure
--managedis not sufficiently descriptive.
Can we make it--keep-operational-details-during-redaction? or perhaps--log-redact-policy=managedBut really I am also wondering if this needs a documented command-line flag at all. I think the env var should be sufficient? Especially for CC deployments, adding a new flag means that the operator team needs to detect the version of crdb before they decide to add the flag.
With an env var, they can set it for every version without causing errors and the new version will pick it up automatically.
That SGTM, especially since this is primarily meant to only be used by CRL. Thanks for the guidance here - I'll prepare a patch to remove the flag & solely rely on the env var instead.
c5a9f9d to
f2ffb93
Compare
abarganier
left a comment
There was a problem hiding this comment.
I've removed the flag in favor of an environment variable, and adjusted the naming to be more descriptive.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei, @knz, and @logston)
--managed flag, log.SafeManaged redaction function & usageCOCKROACH_REDACTION_POLICY_MANAGED env var
knz
left a comment
There was a problem hiding this comment.
I think this is missing unit tests.
Reviewed 11 of 12 files at r7, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @abarganier, @andreimatei, @knz, and @logston)
pkg/cli/flags.go line 526 at r7 (raw file):
} cliCtx.redactionPolicyManaged = envutil.EnvOrDefaultBool(redactionPolicyManagedEnvVar, false)
Not quite. I'm expecting this will crash.
Also, it won't properly reset in-between unit tests.
I recommend you move this logic in context.go (same directory), in setCliContextDefaults().
Currently, log redaction policies have no way to discern their own
runtime environment. Logged objects that may be considered sensitive
and unsafe in on-prem deployments of CockroachDB might be otherwise
safe when we're running within a managed service such as Cockroach
Cloud. For example, CLI argument lists included as part of the
`cockroach start` command are already known to those operating the
managed service, so there's no reason we should be redacting this
information from logs in this case.
This patch adds the `COCKROACH_REDACTION_POLICY_MANAGED` env var to
be handled in the various the start commands. This flag is plumbed
through to the global logging config object where the log package has
access to it.
We also introduce `log.SafeManaged(s interface{})`, which conditionally
marks an object with `redact.Safe()` depending on whether or not we
are running as a managed service. This is a successor to the original
`log.SafeOperational(s interface{})` function.
I believe that this new env var should not be advertised in
our public documentation, as its intended use is for those running
Cockroach Cloud.
This patch also implements a handful of these redaction tweaks.
While it's not meant to be exhaustive, it supplies a good
baseline example of how `log.SafeManaged()` is used, and
provides an immediate reduction of redaction when it comes
to critical information for support staff, such as CLI args,
network addresses, and more.
Additionally, a few log lines have been identified as being
redacted when it was not necessary. This patch also makes a
few spot improvements using `redact.Safe` as well in such
cases.
We intentionally avoid a release note here, as we aim to keep
the usage of this environment variable internal.
Release justification: low-risk, high benefit changes to existing
functionality. The new CLI flag has a minimal impact on DB
operations and provides high value reduction of log redaction,
which will be necessary for support staff with our latest compliance
requirements.
Release note: none
f2ffb93 to
4640c58
Compare
abarganier
left a comment
There was a problem hiding this comment.
Added unit tests and reduced invasiveness on pkg/cli. We now contain the handling of the env var within pkg/util/log.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei, @knz, and @logston)
pkg/cli/flags.go line 526 at r7 (raw file):
Previously, knz (Raphael 'kena' Poss) wrote…
Not quite. I'm expecting this will crash.
Also, it won't properly reset in-between unit tests.I recommend you move this logic in
context.go(same directory), insetCliContextDefaults().
I didn't experience any crash during manual testing, but nonetheless, if we're not using a CLI flag anymore then it doesn't feel like we need to invade on this package.
I've instead opted to move all of this into log.ApplyConfig. We can read the env variable there, allowing tests that touch logging (e.g. use TestingResetActive() and log.ApplyConfig()) to reset in-between unit tests.
knz
left a comment
There was a problem hiding this comment.
Reviewed 8 of 8 files at r8, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei, @knz, and @logston)
|
bors r=knz |
|
Build succeeded: |
Currently, log redaction policies have no way to discern their own
runtime environment. Logged objects that may be considered sensitive
and unsafe in on-prem deployments of CockroachDB might be otherwise
safe when we're running within a managed service such as Cockroach
Cloud. For example, CLI argument lists included as part of the
cockroach startcommand are already known to those operating themanaged service, so there's no reason we should be redacting this
information from logs in this case.
This patch adds the
--managedflag to the start commands. Thisflag is plumbed through to the global logging config object where
the log package has access to it.
We also introduce
log.SafeManaged(s interface{}), which conditionallymarks an object with
redact.Safe()depending on whether or not weare running as a managed service. This is inspired by the original
log.SafeOperational(s interface{})function.I believe that this new
--managedflag should not be advertised inour public documentation, as its intended use is for those running
Cockroach Cloud.
Release justification: low-risk, high benefit changes to existing
functionality. The new CLI flag has a minimal impact on DB
operations and provides high value reduction of log redaction,
which will be necessary for support staff with our latest compliance
requirements.
Release note (cli change):
cockroach startcommands now have anadditional
--managedflag that can be used to indicate whetheror not the node is running as part of a managed service (e.g.
Cockroach Cloud). Perhaps this shouldn't be advertised in our
public facing docs, as its only intended for use by those running
Cockroach Cloud and not for on-prem deployments.
Addresses #86316