cli: command to deserialize descriptors#52972
Conversation
knz
left a comment
There was a problem hiding this comment.
The code looks good! But this will need a test before it merges.
Also from a UX perspective, I don't think that copy-pasting long proto encodings on the command line is the way to go.
First of all, the typical UX will be a bunch of codes inside a file (e.g. a dump of system.descriptors in TSV format)
also some shells do not like argument lists that are longer than X characters.
So I think I would find it more convenient to use if the tool were to read strings from its standard input.
Ideally, it would scan through standard input, reading one string of input at a time (with any kind of junk in-between), and spit out the decoded value on standard output. This way I could redirect the entire TSV dump of system.descriptors to it and it would ignore the columns that don't contain proto-encoded data automatically.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @knz)
spaskob
left a comment
There was a problem hiding this comment.
Yes this makes sense - PTAL and let me know if this is what you were looking for.
After confirmation I will add the tests.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)
knz
left a comment
There was a problem hiding this comment.
Reviewed 1 of 1 files at r1.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @spaskob)
pkg/cli/debug.go, line 479 at r1 (raw file):
Args: cobra.ArbitraryArgs, RunE: func(cmd *cobra.Command, args []string) error { sc := bufio.NewScanner(os.Stdin)
Recommendations:
-
pull this entire function into a separate funcon
runDebugDecodeProto -
then also pull the scanner/scan logic into a separate function that takes an
io.Readeras argument -- this will facilitate testing -
finally pull the single code decoding logic also into a separate function
So you'd have:
func runDebugDecodeProto(_ *cobra.Command, _ []string) error {
return streamMap(os.Stdout, os.Stdin,
func (s string) (bool, string, error) { return tryDecodeValue(s, debugDecodeProtoName) })
}
// streamMap applies `fn` to all the scanned fields in `in`, and reports
// the result of `fn` on `out`.
// Errors returned by `fn` are emitted on `out` with a "warning" prefix.
func streamMap(out os.Writer, in os.Reader, fn func(string) (bool, string, error)) error {
for sc := NewScanner(); sc.Scan(); {
for _, field := range strings.Fields(sc.Text()) {
ok, value, err := fn(field, debugDecodeProtoName)
if err != nil {
fmt.Fprintln(out, "warning: %v", err)
} else if ok {
fmt.Fprintln(out, value)
}
}
}
}
// tryDecodeValue tries to decode the given string with the given proto name
// reports ok=false if the data was not valid proto-encoded.
func tryDecodeValue(s, protoName string) (ok bool, val string, err error) {
bytes, err := gohex.DecodeString(s)
if err != nil {
b, err := base64.StdEncoding.DecodeString(field)
if err != nil {
return false, "", nil
}
bytes = b
}
msg, err := protoreflect.DecodeMessage(protoName, bytes)
if err != nil {
return false, "", nil
}
j, err := protoreflect.MessageToJSON(msg)
if err != nil {
// Unexpected error: the data was valid protobuf, but does not
// reflect back to JSON. We report the protobuf struct in the
// error message nonetheless.
return false, "", errors.Wrapf(err, "while JSON-encoding %#v", msg)
}
return true, j.String(), nil
}Then you can have separate unit tests for streamMap and tryDecodeValue.
knz
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @spaskob)
pkg/cli/debug.go, line 478 at r1 (raw file):
`, Args: cobra.ArbitraryArgs, RunE: func(cmd *cobra.Command, args []string) error {
Also, separately, a tip: when a command-line utility reads from stdin, and a user doesn't know this upfront, and they run it mistakenly interactively from their terminal, then it looks like the command is "hanging" - the user wants "something" to happen but nothing happens.
The idiom to solve this UX pitfall is to inform the user that the command is not interactive when they expect it to be. This information message is only useful when a human user can see it, so there is no need to print it if the command is not run interactively. You can check this with isatty.IsTerminal() (which we already use in other places. For example:
if isatty.IsTerminal(os.Stdin.Fd()) {
fmt.Fprintln(stderr, "# Reading proto-encoded pieces of data from stdin.\n# Press Ctrl+C or Ctrl+D to terminate.")
}
spaskob
left a comment
There was a problem hiding this comment.
PTAL - refactor and tests done
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @knz)
pkg/cli/debug.go, line 478 at r1 (raw file):
Previously, knz (kena) wrote…
Also, separately, a tip: when a command-line utility reads from stdin, and a user doesn't know this upfront, and they run it mistakenly interactively from their terminal, then it looks like the command is "hanging" - the user wants "something" to happen but nothing happens.
The idiom to solve this UX pitfall is to inform the user that the command is not interactive when they expect it to be. This information message is only useful when a human user can see it, so there is no need to print it if the command is not run interactively. You can check this with
isatty.IsTerminal()(which we already use in other places. For example:if isatty.IsTerminal(os.Stdin.Fd()) { fmt.Fprintln(stderr, "# Reading proto-encoded pieces of data from stdin.\n# Press Ctrl+C or Ctrl+D to terminate.") }
Done.
pkg/cli/debug.go, line 479 at r1 (raw file):
Previously, knz (kena) wrote…
Recommendations:
pull this entire function into a separate funcon
runDebugDecodeProtothen also pull the scanner/scan logic into a separate function that takes an
io.Readeras argument -- this will facilitate testingfinally pull the single code decoding logic also into a separate function
So you'd have:
func runDebugDecodeProto(_ *cobra.Command, _ []string) error { return streamMap(os.Stdout, os.Stdin, func (s string) (bool, string, error) { return tryDecodeValue(s, debugDecodeProtoName) }) } // streamMap applies `fn` to all the scanned fields in `in`, and reports // the result of `fn` on `out`. // Errors returned by `fn` are emitted on `out` with a "warning" prefix. func streamMap(out os.Writer, in os.Reader, fn func(string) (bool, string, error)) error { for sc := NewScanner(); sc.Scan(); { for _, field := range strings.Fields(sc.Text()) { ok, value, err := fn(field, debugDecodeProtoName) if err != nil { fmt.Fprintln(out, "warning: %v", err) } else if ok { fmt.Fprintln(out, value) } } } } // tryDecodeValue tries to decode the given string with the given proto name // reports ok=false if the data was not valid proto-encoded. func tryDecodeValue(s, protoName string) (ok bool, val string, err error) { bytes, err := gohex.DecodeString(s) if err != nil { b, err := base64.StdEncoding.DecodeString(field) if err != nil { return false, "", nil } bytes = b } msg, err := protoreflect.DecodeMessage(protoName, bytes) if err != nil { return false, "", nil } j, err := protoreflect.MessageToJSON(msg) if err != nil { // Unexpected error: the data was valid protobuf, but does not // reflect back to JSON. We report the protobuf struct in the // error message nonetheless. return false, "", errors.Wrapf(err, "while JSON-encoding %#v", msg) } return true, j.String(), nil }Then you can have separate unit tests for
streamMapandtryDecodeValue.
Done.
knz
left a comment
There was a problem hiding this comment.
a nit, and also the linter is unhappy:
[11:10:28]W: [Step 6/6] --- a/pkg/cli/decode_test.go
[11:10:28]W: [Step 6/6] +++ b/pkg/cli/decode_test.go
[11:10:28]W: [Step 6/6] @@ -9,12 +9,14 @@ import (
[11:10:28]W: [Step 6/6]
[11:10:28]W: [Step 6/6] "github.com/cockroachdb/cockroach/pkg/sql/catalog/descpb"
[11:10:28]W: [Step 6/6] "github.com/cockroachdb/cockroach/pkg/sql/protoreflect"
[11:10:28]W: [Step 6/6] + "github.com/cockroachdb/cockroach/pkg/util/leaktest"
[11:10:28]W: [Step 6/6] "github.com/cockroachdb/cockroach/pkg/util/protoutil"
[11:10:28]W: [Step 6/6] "github.com/cockroachdb/errors"
[11:10:28]W: [Step 6/6] "github.com/stretchr/testify/require"
[11:10:28]W: [Step 6/6] )
[11:10:28]W: [Step 6/6]
[11:10:28]W: [Step 6/6] func TestStreamMap(t *testing.T) {
[11:10:28]W: [Step 6/6] + defer leaktest.AfterTest(t)()
[11:10:28]W: [Step 6/6] tests := []struct {
[11:10:28]W: [Step 6/6] name string
[11:10:28]W: [Step 6/6] in string
[11:10:28]W: [Step 6/6] @@ -58,6 +60,7 @@ func TestStreamMap(t *testing.T) {
[11:10:28]W: [Step 6/6] }
[11:10:28]W: [Step 6/6]
[11:10:28]W: [Step 6/6] func TestTryDecodeValue(t *testing.T) {
[11:10:28]W: [Step 6/6] + defer leaktest.AfterTest(t)()
[11:10:28]W: [Step 6/6] protoName := "cockroach.sql.sqlbase.TableDescriptor"
[11:10:28]W: [Step 6/6] marshal := func(pb protoutil.Message) []byte {
[11:10:28]W: [Step 6/6] s, err := protoutil.Marshal(pb)Reviewed 3 of 3 files at r2.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @ajwerner and @spaskob)
pkg/cli/decode_test.go, line 49 at r2 (raw file):
for _, tt := range tests { t.Run(tt.name, func(t *testing.T) { out := &bytes.Buffer{}
nit: var out bytes.Buffer is more idiomatic
pkg/cli/decode_test.go, line 50 at r2 (raw file):
t.Run(tt.name, func(t *testing.T) { out := &bytes.Buffer{} err := streamMap(out, strings.NewReader(tt.in), tt.fn)
and then pass this via &out
1299e5e to
3d9eb2d
Compare
|
also you need to update the release note with the new name of the flag |
e6ea48b to
7076c8d
Compare
We currently export the descriptors and job payloads in debug zips with hex encoding. It is often very valuable to decode these protos into JSON for inspection. Fixes cockroachdb#52063. Release note (cli change): The new debug command `decode-proto` reads descriptor from stdin in hex or base64 format (auto-detected) and a flag --schema=<fully qualified name to decode> with default value `cockroach.sql.sqlbase.Descriptor` and outputs to stdout the deserialized proto in JSON format. If the input is not a hex/base64 encoded proto then it is outputted verbatim.
|
bors r+ |
|
Build failed (retrying...): |
|
Build succeeded: |
We currently export the descriptors and job payloads in debug zips with
hex encoding. It is often very valuable to decode these protos into
JSON for inspection.
Fixes #52063.
Release note (cli change):
The new debug command
decode-protoreads descriptor from stdin in hexor base64 format (auto-detected) and a flag --schema=<fully qualified name to decode>
with default value
cockroach.sql.sqlbase.Descriptorand outputs to stdout thedeserialized proto in JSON format.