sql: add partial redactability to logs#66359
Conversation
|
can you perhaps rebase your branch on top of |
knz
left a comment
There was a problem hiding this comment.
Another thing we can do before we discuss:
can you outline in the PR description (in prose) a high-level summary of what you did, and why? That will help clarify your understanding and ensure the convo doesn't make mistaken assumptions.
Reviewable status:
complete! 0 of 0 LGTMs obtained
a854963 to
3868842
Compare
3868842 to
3b16e16
Compare
maryliag
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @THardy98)
pkg/sql/event_log.go, line 181 at r1 (raw file):
user: p.User(), //stmt: tree.AsStringWithFQNames(p.stmt.AST, p.extendedEvalCtx.EvalContext.Annotations), stmt: string(tree.AsRedactableStringWithFQNames(p.stmt.AST, p.extendedEvalCtx.EvalContext.Annotations)),
since the name of the function already indicates that it will return a string, you don't need the extra string() here
pkg/sql/sem/tree/format.go, line 439 at r1 (raw file):
ctx := NewFmtCtxEx(FmtAlwaysQualifyTableNames|FmtRedactNode, ann) ctx.FormatNode(n) return ctx.CloseAndGetRedactableString()
you can add the content of the CloseAndGetRedactableString() directly here, no need to create a new function just for that since you're not using this part of code anywhere
maryliag
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @THardy98)
pkg/util/log/redact.go, line 40 at r1 (raw file):
// part(s) of the SQL statement, in contrast with withRedaction, which redacts // the entirety of the SQL statement withPartialRedaction = 8
We should keep using the flag withRedaction, instead of creating a new one and then fix the behaviours on that one, since I don't think hiding the whole statement is useful anyway
|
Hint: this work should not require any changes to the |
Ah, so changes should have been made for |
what kind of changes? |
None in |
|
Please use more words with examples :) |
When Currently, it would receive something like: But with changes to |
|
yes that sounds reasonable. |
|
@knz do you have a suggestions to enforcing no redaction on virtual schemas/tables? I noted in the code how I can't use |
knz
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @THardy98)
pkg/sql/sem/tree/format.go, line 496 at r1 (raw file):
func (ctx *FmtCtx) formatNodeMaybeRedact(n NodeFormatter) { if ctx.flags.HasFlags(FmtRedactNode) { switch v := n.(type) {
This is not the right location for this.
Instead of this method,
modify hideNonVirtualTableNameFunc in sql/exec_util.go
and conditionally replace the formatting flags using (*FmtCtx).WithFlags() inside the reformat function implemented there.
0ed6a0f to
7967603
Compare
7967603 to
271b23b
Compare
66431: eventpb,sql: lay some foundation for redactable statement strings r=maryliag,dhartunian a=knz Informs #65401. Needed as prerequisite for #66359. Prior to this change, the centralized structured logging logic and the query logging mechanism were both assuming that the entirety of the statement string was unsafe for reporting. We want in a later change to introduce some nuance into this. However, the structured event logging encoding did not know anything else than strings that were either completely-safe or completely-unsafe. This patch introduces the possibility for a structured event to contain a string of go type `redact.RedactableString`, where the caller has already taken responsibility for placing redaction markers. Release note: None 66518: builtins: implement ST_HasArc r=rafiss a=otan Release note (sql change): Implement the ST_HasArc function. This adds better out-of-the-box support for Geoserver. Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net> Co-authored-by: Oliver Tan <otan@cockroachlabs.com>
b95e1a0 to
49ae063
Compare
knz
left a comment
There was a problem hiding this comment.
Still LGTM, but you also still need to iron out your grep for CI.
Reviewed 4 of 11 files at r6, 1 of 3 files at r8, 5 of 7 files at r11, 2 of 4 files at r12, 1 of 1 files at r13, 1 of 1 files at r14.
Reviewable status:complete! 2 of 0 LGTMs obtained (waiting on @maryliag and @THardy98)
pkg/cli/interactive_tests/test_exec_log.tcl, line 103 at r14 (raw file):
# succeeds. system "for i in `seq 1 3`; do grep 'SELECT .+660.+' $logfile && exit 0;
- the
+sign is part of extended regular expressions. Basicgrepdoesn't support it. You probably want to use..*instead or try your luck withgrep -E - also I don't think that you just want to match "anything" after the number. Obviously, in a log file the data will be enclosed in double quotes and other punctuation. I think you want this match to say "anything BUT some other word or punctuation that I'm otherwise also expecting after the number". In this case, the thing that's expected afterwards is a space followed by a plus sign.
same comments apply below.
pkg/ccl/importccl/import_csv_mark_redaction_test.go, line 22 at r14 (raw file):
// TestMarkRedactionCCLStatement verifies that CCL statements are marked // for redaction correctly.
they are not "marked for redaction" - that phrasing would suggest that they will necessarily be redacted at some point
I think you mean "... verifies that the redactable parts of CCL statements are marked correctly."
pkg/sql/statement_mark_redaction_test.go, line 27 at r14 (raw file):
Previously, maryliag (Marylia Gutierrez) wrote…
nit: period
I also think you need to rephrase the comment, see my previous comment on this above.
7fbcc7c to
da22982
Compare
THardy98
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @Azhng, @knz, @maryliag, and @THardy98)
pkg/cli/interactive_tests/test_exec_log.tcl, line 103 at r14 (raw file):
Previously, knz (kena) wrote…
- the
+sign is part of extended regular expressions. Basicgrepdoesn't support it. You probably want to use..*instead or try your luck withgrep -E- also I don't think that you just want to match "anything" after the number. Obviously, in a log file the data will be enclosed in double quotes and other punctuation. I think you want this match to say "anything BUT some other word or punctuation that I'm otherwise also expecting after the number". In this case, the thing that's expected afterwards is a space followed by a plus sign.
same comments apply below.
Modified per your suggestion, but a last look would be nice.
pkg/sql/exec_util.go, line 1631 at r14 (raw file):
Previously, maryliag (Marylia Gutierrez) wrote…
nit: remove blank line between the comment and the code it's referring to
Done.
pkg/ccl/importccl/import_csv_mark_redaction_test.go, line 22 at r12 (raw file):
Previously, Azhng (Archer Zhang) wrote…
nit: period
Done.
pkg/ccl/importccl/import_csv_mark_redaction_test.go, line 22 at r14 (raw file):
Previously, knz (kena) wrote…
they are not "marked for redaction" - that phrasing would suggest that they will necessarily be redacted at some point
I think you mean "... verifies that the redactable parts of CCL statements are marked correctly."
Done.
pkg/sql/statement_mark_redaction_test.go, line 27 at r14 (raw file):
Previously, knz (kena) wrote…
I also think you need to rephrase the comment, see my previous comment on this above.
Done.
|
bors r+ |
|
Build failed (retrying...): |
|
bors r- 🚨 bors is busted, doing a reset 🚨 |
|
Canceled. |
|
bors r=knz |
|
This PR was included in a batch that was canceled, it will be automatically retried |
|
Build failed (retrying...): |
|
bors r- ^ that last failure seems related to this PR. may need a rebase. |
|
Canceled. |
da22982 to
b624e29
Compare
|
bors r+ |
|
Build failed (retrying...): |
|
Build failed (retrying...): |
|
bors r- |
|
Canceled. |
Prior to this change, enabling log redaction redacted entire SQL statements. Consequently, troubleshooting efforts were hindered by the limited SQL statement information available in the redacted logs. This change introduces partial redaction to log SQL statements, wherein only sensitive information is redacted. As a result, more information is available to those using redacted logs for troubleshooting. Changes were introduced to the statement formatter. The formatter now checks where the redaction flag is set. If so, redaction markers are placed are specific node types in the statement's AST (namely, the Datum, Constant, Name, and Unrestricted Name types). Virtual schemas/tables are not redacted. The `public` schema is also not redacted. Resolves: cockroachdb#65401 Release note (bug fix): Added partial redactability to log SQL statements. This change provides greater visibility to SQL usage in the logs, enabling greater ability to troubleshoot.
|
bors r+ |
|
Build succeeded: |
Prior to this change, enabling log redaction redacted entire SQL
statements. Consequently, troubleshooting efforts were hindered by the
limited SQL statement information available in the redacted logs. This
change introduces partial redaction to log SQL statements, wherein only
sensitive information is redacted. As a result, more information is
available to those using redacted logs for troubleshooting.
Changes were introduced to the statement formatter. The formatter now
checks where the redaction flag is set. If so, redaction markers are
placed are specific node types in the statement's AST (namely, the
Datum, Constant, Name, and Unrestricted Name types).
Virtual schemas/tables are not redacted. The
publicschema is also notredacted.
Note: redaction markers are ‹ and ›. Terms between the redaction markers are redacted.
Redaction Marker Examples:
Redaction Examples:
Resolves: #65401
Release note (bug fix): Added partial redactability to log SQL
statements. This change provides greater visibility to SQL usage in the
logs, enabling greater ability to troubleshoot.