sql: fix the pg text representation of tuples/arrays#28151
sql: fix the pg text representation of tuples/arrays#28151craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
d9cef74 to
ca2ec25
Compare
|
Now with tests. I checked that all the combinations format the same as postgres. |
|
welp I spoke too fast. Fixing. |
7c2ecbb to
211795d
Compare
|
There we go. Actually once I understood how this works it's rather regular/simple. |
|
RFAL |
| if i > 0 { | ||
| sep := "" | ||
| // TODO(justin): add a test for nested arrays. | ||
| for _, d := range v.Array { |
There was a problem hiding this comment.
might as well just do if i > 0 { WriteByte(' ') } ?
pkg/sql/sem/tree/format.go
Outdated
| // Used internally in combination with FmtArrays defined below. | ||
| fmtWithinArray | ||
| // Used internally in combination with FmtPgwireText defined below. | ||
| pgwireFormat |
There was a problem hiding this comment.
I'm not especially convinced that this shouldn't be its own method or function (on Datum) at this point, rather than a flag on Format, but I don't feel particularly strongly.
BramGruneir
left a comment
There was a problem hiding this comment.
Looks good overall, but some comments.
Reviewed 13 of 13 files at r1.
Reviewable status:complete! 0 of 0 LGTMs obtained
pkg/sql/logictest/testdata/logic_test/tuple, line 712 at r1 (raw file):
---- r ("(""(1,""""2"""",3)"",""(4,""""5"""")"",""(6)"")","(7,8)","(""9"")")
This is so ugly. :(
pkg/sql/sem/tree/datum.go, line 413 at r1 (raw file):
func (d *DBool) Format(ctx *FmtCtx) { if ctx.HasFlags(pgwireFormat) { v := byte('f')
Why store this value?
if bool(*d) {
ctx.WriteByte('t')
} else {
ctx.WriteByte('f')
}
return
pkg/sql/sem/tree/datum.go, line 2734 at r1 (raw file):
ctx.WriteByte('(') comma := "" for _, v := range d.D {
Not sure if this is better, but it reads a bit nicer I think. Also this way you don't set the comma variable each time.
for i, v := range d.D {
if i > 1 {
ctx.WriteString(", ")
}
ctx.FormatNode(v)
}
I really want a bool for first and last and I'm kinda surprised that they were not included in golang's range operator.
pkg/sql/sem/tree/datum.go, line 3026 at r1 (raw file):
} ctx.WriteString("ARRAY[")
Here too,
pkg/sql/sem/tree/pgwire_encode.go, line 37 at r1 (raw file):
// string printer called pgwireFormatStringInTuple(). ctx.WriteByte('(') comma := ""
See my other comments, I'd rather a check on the iteration count than setting comma on each pass. For all of the commas in this file.
pkg/sql/sem/tree/pgwire_encode.go, line 57 at r1 (raw file):
func pgwireFormatStringInTuple(buf *bytes.Buffer, in string) { // TODO(knz): to be fully pg-compliant, this function should avoid
For all of these, this seems pretty easy to do. Is there a reason to not just use a different buffer then copy from one to the other with or without the double quotes?
pkg/sql/sem/tree/pgwire_encode.go, line 111 at r1 (raw file):
func pgwireFormatStringInArray(buf *bytes.Buffer, in string) { // TODO(knz): to be fully pg-compliant, this function should avoid
same here
pkg/util/stringencoding/string_encoding.go, line 86 at r1 (raw file):
// EncodeChar is used internally to write out a character from // a larger string to a buffer. func EncodeChar(buf *bytes.Buffer, entireString string, currentRune rune, currentIdx int) {
This function looks like it could use a unit test.
knz
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained
pkg/sql/logictest/testdata/logic_test/tuple, line 712 at r1 (raw file):
Previously, BramGruneir (Bram Gruneir) wrote…
This is so ugly. :(
It is ugly but it is also correct. See my other comment below.
pkg/sql/pgwire/types.go, line 165 at r1 (raw file):
Previously, justinj (Justin Jaffray) wrote…
might as well just do
if i > 0 { WriteByte(' ') }?
This way is actually faster / more efficient. Vivek taught me this trick. I'm using it as a general-purpose pattern now.
pkg/sql/sem/tree/datum.go, line 413 at r1 (raw file):
Previously, BramGruneir (Bram Gruneir) wrote…
Why store this value?
if bool(*d) { ctx.WriteByte('t') } else { ctx.WriteByte('f') } return
I don't care either way. Changed.
pkg/sql/sem/tree/datum.go, line 2734 at r1 (raw file):
Previously, BramGruneir (Bram Gruneir) wrote…
Not sure if this is better, but it reads a bit nicer I think. Also this way you don't set the comma variable each time.
for i, v := range d.D { if i > 1 { ctx.WriteString(", ") } ctx.FormatNode(v) }I really want a bool for first and last and I'm kinda surprised that they were not included in golang's range operator.
This way is actually faster / more efficient. Vivek taught me this trick. I'm using it as a general-purpose pattern now.
pkg/sql/sem/tree/datum.go, line 3026 at r1 (raw file):
Previously, BramGruneir (Bram Gruneir) wrote…
Here too,
This way is actually faster / more efficient. Vivek taught me this trick. I'm using it as a general-purpose pattern now.
pkg/sql/sem/tree/format.go, line 87 at r1 (raw file):
Previously, justinj (Justin Jaffray) wrote…
I'm not especially convinced that this shouldn't be its own method or function (on
Datum) at this point, rather than a flag onFormat, but I don't feel particularly strongly.
This combination still allows up to combine other flags (e.g. anonymize).
pkg/sql/sem/tree/pgwire_encode.go, line 37 at r1 (raw file):
Previously, BramGruneir (Bram Gruneir) wrote…
See my other comments, I'd rather a check on the iteration count than setting comma on each pass. For all of the commas in this file.
ditto my other comment
pkg/sql/sem/tree/pgwire_encode.go, line 57 at r1 (raw file):
Previously, BramGruneir (Bram Gruneir) wrote…
For all of these, this seems pretty easy to do. Is there a reason to not just use a different buffer then copy from one to the other with or without the double quotes?
Because performance.
@justinj has decided to do it this way originally and reasoned as follows: it is correct to add the double quotes when they are not required. Removing them is only "niceness" to the human user. I'm not revisiting this here -- we can make a separate PR for that.
pkg/sql/sem/tree/pgwire_encode.go, line 111 at r1 (raw file):
Previously, BramGruneir (Bram Gruneir) wrote…
same here
ditto
pkg/util/stringencoding/string_encoding.go, line 86 at r1 (raw file):
Previously, BramGruneir (Bram Gruneir) wrote…
This function looks like it could use a unit test.
None of these functions are tested directly in this package. Their unit tests live in format_test.go in pkg/sql/sem/tree and pkg/sql/lex.
211795d to
7071d03
Compare
This patch fixes the conversion of arrays, tuples and bools to text for the purpose of emitting the values back to a client through pgwire with the text protocol. This now properly supports nested arrays, arrays containing tuples, tuples containing arrays, etc. The boolean values get serialized properly to 't' and 'f' like in PostgreSQL. Release note (bug fix): CockroachDB supports a wider range of tuple and array values in query results.
7071d03 to
200d2a6
Compare
knz
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained
pkg/sql/logictest/testdata/logic_test/srfs, line 543 at r2 (raw file):
SELECT ((i.keys).*, 123) FROM (SELECT information_schema._pg_expandarray(ARRAY[3,2,1]) AS keys) AS i ---- ((3, 1),123)
Jordan just noticed there is a bug here, but unrelated to this PR. Filed as #28203
|
PTAL with Justin's offline comments in mind? |
|
I'll hold off this while Bram prepares his all-eng prez. |
|
Bram gave a 👍 offline! thanks bram |
|
bors r+ |
28151: sql: fix the pg text representation of tuples/arrays r=knz a=knz Forked off #28143. Fixes #25522. This patch fixes the conversion of arrays and tuples to text for the purpose of emitting the arrays/tuples back to a client through pgwire. This now properly supports nested arrays, arrays containing tuples, tuples containing arrays, etc. Labels in tuples are never emitted through pgwire. Release note (bug fix): CockroachDB supports a wider range of tuple and array values in query results. Co-authored-by: Raphael 'kena' Poss <knz@cockroachlabs.com>
Build succeeded |
Forked off #28143.
Fixes #25522.
This patch fixes the conversion of arrays and tuples to text for the
purpose of emitting the arrays/tuples back to a client through pgwire.
This now properly supports nested arrays, arrays containing tuples,
tuples containing arrays, etc. Labels in tuples are never emitted
through pgwire.
Release note (bug fix): CockroachDB supports a wider range of tuple
and array values in query results.