docs/tech-notes: adding a note about JSONB key encoding#99849
docs/tech-notes: adding a note about JSONB key encoding#99849craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
8ab0fae to
286a36d
Compare
rytaft
left a comment
There was a problem hiding this comment.
Nice work!
Reviewed 1 of 1 files at r1, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @mgartner and @Shivs11)
docs/tech-notes/jsonb_forward_indexing.md line 4 at r1 (raw file):
This document intends to build on the JSONB value encoding RFC
Add a link to the RFC
Currently, it is only possible to value-encode JSONB values in CRDB....
Since you're implementing key encoding right now, I would change this paragraph to say that the key encoding does exist, and this doc explains how it works.
docs/tech-notes/jsonb_forward_indexing.md line 8 at r1 (raw file):
The motivation and use-cases for including JSONB were included in the scoping RFC of JSONB.
Add a link
Currently, it is only possible to value-encode...
Same comment as above
docs/tech-notes/jsonb_forward_indexing.md line 32 at r1 (raw file):
4. Moreover, arrays with equal number of elements are compared in the order: element - 1, element - 2, element - 3, ….
nit: this notation is a bit confusing, since it looks like you are subtracting 1 from element. Perhaps element1, element2, element3, ... would be clearer?
docs/tech-notes/jsonb_forward_indexing.md line 35 at r1 (raw file):
5. Objects with an equal number of key value pairs are compared in the order: key - 1, value - 1, key - 2, value - 2, ….
same comment here
docs/tech-notes/jsonb_forward_indexing.md line 37 at r1 (raw file):
Moreover, care has to be taken to ensure that none of these bytes correspond to a possible encoding of a JSON value.
I thought we decided this wasn't a concern?
docs/tech-notes/jsonb_forward_indexing.md line 100 at r1 (raw file):
JSONB_Array, enc(1), enc(1), JSONB_Terminator,
Are you missing JSONB_Number in between the two enc(1)? And a second JSONB_Terminator at the end of the array?
JSONB_String, enc(b),
Are you missing a JSONB_Terminator here?
286a36d to
595fa45
Compare
Shivs11
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @mgartner and @rytaft)
docs/tech-notes/jsonb_forward_indexing.md line 4 at r1 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
This document intends to build on the JSONB value encoding RFC
Add a link to the RFC
Currently, it is only possible to value-encode JSONB values in CRDB....
Since you're implementing key encoding right now, I would change this paragraph to say that the key encoding does exist, and this doc explains how it works.
Done!
docs/tech-notes/jsonb_forward_indexing.md line 8 at r1 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
The motivation and use-cases for including JSONB were included in the scoping RFC of JSONB.
Add a link
Currently, it is only possible to value-encode...
Same comment as above
Done!
docs/tech-notes/jsonb_forward_indexing.md line 32 at r1 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
nit: this notation is a bit confusing, since it looks like you are subtracting 1 from element. Perhaps
element1, element2, element3, ...would be clearer?
I agree. Changes have been made.
docs/tech-notes/jsonb_forward_indexing.md line 35 at r1 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
same comment here
Done.
docs/tech-notes/jsonb_forward_indexing.md line 37 at r1 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
Moreover, care has to be taken to ensure that none of these bytes correspond to a possible encoding of a JSON value.
I thought we decided this wasn't a concern?
You're right. The well defined structure of the encoding could resolve this issue. Removing this.
docs/tech-notes/jsonb_forward_indexing.md line 100 at r1 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
JSONB_Array, enc(1), enc(1), JSONB_Terminator,Are you missing
JSONB_Numberin between the two enc(1)? And a secondJSONB_Terminatorat the end of the array?
JSONB_String, enc(b),Are you missing a
JSONB_Terminatorhere?
Yes, thank you for the catch. Was also missing a JSONB_Terminator at the very end of the last example to notify the end of an object.
Shivs11
left a comment
There was a problem hiding this comment.
Thank you for the review!
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @mgartner and @rytaft)
rytaft
left a comment
There was a problem hiding this comment.
Reviewed 1 of 1 files at r2, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @mgartner and @Shivs11)
docs/tech-notes/jsonb_forward_indexing.md line 5 at r2 (raw file):
This document intends to build on the JSONB value encoding [RFC](https://github.com/cockroachdb/cockroach/blob/master/docs/RFCS/20171005_jsonb_encoding.md). Previously, it was only possible to value-encode JSONB values in CRDB. The following document proposes a format
nit: "proposes" still sounds like this is a feature that you haven't yet implemented. I'd say "describes" instead.
docs/tech-notes/jsonb_forward_indexing.md line 6 at r2 (raw file):
This document intends to build on the JSONB value encoding [RFC](https://github.com/cockroachdb/cockroach/blob/master/docs/RFCS/20171005_jsonb_encoding.md). Previously, it was only possible to value-encode JSONB values in CRDB. The following document proposes a format for key-encoding JSONB values in hopes of having a semantic ordering for JSON which in turn allows forward indexes on JSON columns.
ditto "in hopes of" -- maybe "for the purpose of" is better
docs/tech-notes/jsonb_forward_indexing.md line 113 at r2 (raw file):
enc(
'{“a”: ‘[1]’, “b”: {“c”: 0}'::JSONB)
Looks like you're missing a closing }
This commit adds a tech note that goes ahead to explain how JSON will be key encoded in CRDB. It builds on the RFC for JSONB encoding and explains the main motivation behind the requirement for a key encoding in the first place. The tech note also dives into details pertaining to the encoding for each of the present JSON values: NULL, Strings, Numbers, Boolean, Arrays and Objects. Lastly, the tech note also consists of examples showing how JSON values are encoded in hopes of simplifying the concept to a new reader. Release note: None Epic: CRDB-24501
595fa45 to
0a28e35
Compare
Shivs11
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @mgartner and @rytaft)
docs/tech-notes/jsonb_forward_indexing.md line 5 at r2 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
nit: "proposes" still sounds like this is a feature that you haven't yet implemented. I'd say "describes" instead.
Done!
docs/tech-notes/jsonb_forward_indexing.md line 6 at r2 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
ditto "in hopes of" -- maybe "for the purpose of" is better
Done.
docs/tech-notes/jsonb_forward_indexing.md line 113 at r2 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
enc(
'{“a”: ‘[1]’, “b”: {“c”: 0}'::JSONB)Looks like you're missing a closing
}
Done!
rytaft
left a comment
There was a problem hiding this comment.
Reviewed 1 of 1 files at r3, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @mgartner)
mgartner
left a comment
There was a problem hiding this comment.
Nicely done! Be sure to update this when we decide how to handle the encoding of an empty JSONB array.
Shivs11
left a comment
There was a problem hiding this comment.
Sure thing @mgartner! Thank you for the review!
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @yuzefovich)
|
bors r+ |
|
Build succeeded: |
Currently, cockroachdb#97928 and cockroachdb#99275 are responsible for laying out a lexicographical ordering for JSON columns to be forward indexable in nature. This ordering is based on the rules posted by Postgres and is defined here: cockroachdb#99849 However, Postgres currently sorts the empty JSON array before any other JSON values. An issue has been opened: https://www.postgresql.org/message-id/17873-826fdc8bbcace4f1%40postgresql.org Thus, this PR intends on replicating this behaviour until the issue has been identified and resolved by Postgres. Epic: CRDB-24501 Release note (sql change): This PR shall now place the empty JSON array to have the lowest (highest) precedence when the JSON column is being sorted in ascending (descending) order.
Currently, cockroachdb#97928 and cockroachdb#99275 are responsible for laying out a lexicographical ordering for JSON columns to be forward indexable in nature. This ordering is based on the rules posted by Postgres and is defined here: cockroachdb#99849 However, Postgres currently sorts the empty JSON array before any other JSON values. An issue has been opened: https://www.postgresql.org/message-id/17873-826fdc8bbcace4f1%40postgresql.org Thus, this PR intends on replicating this behaviour until the issue has been identified and resolved by Postgres. Epic: CRDB-24501 Release note (sql change): This PR shall now place the empty JSON array to have the lowest (highest) precedence when the JSON column is being sorted in ascending (descending) order.
Currently, cockroachdb#97928 and cockroachdb#99275 are responsible for laying out a lexicographical ordering for JSON columns to be forward indexable in nature. This ordering is based on the rules posted by Postgres and is in cockroachdb#99849. However, Postgres currently sorts the empty JSON array before any other JSON values. A Postgres bug report for this has been opened: https://www.postgresql.org/message-id/17873-826fdc8bbcace4f1%40postgresql.org This PR intends on replicating the Postgres behavior. Fixes cockroachdb#105668 Epic: CRDB-24501 Release note: None
Currently, cockroachdb#97928 and cockroachdb#99275 are responsible for laying out a lexicographical ordering for JSON columns to be forward indexable in nature. This ordering is based on the rules posted by Postgres and is in cockroachdb#99849. However, Postgres currently sorts the empty JSON array before any other JSON values. A Postgres bug report for this has been opened: https://www.postgresql.org/message-id/17873-826fdc8bbcace4f1%40postgresql.org This PR intends on replicating the Postgres behavior. Fixes cockroachdb#105668 Epic: CRDB-24501 Release note: None
Currently, cockroachdb#97928 and cockroachdb#99275 are responsible for laying out a lexicographical ordering for JSON columns to be forward indexable in nature. This ordering is based on the rules posted by Postgres and is in cockroachdb#99849. However, Postgres currently sorts the empty JSON array before any other JSON values. A Postgres bug report for this has been opened: https://www.postgresql.org/message-id/17873-826fdc8bbcace4f1%40postgresql.org This PR intends on replicating the Postgres behavior. Fixes cockroachdb#105668 Epic: CRDB-24501 Release note: None
101260: sql: replicating JSON empty array ordering found in Postgres r=mgartner a=Shivs11 Currently, #97928 and #99275 are responsible for laying out a lexicographical ordering for JSON columns to be forward indexable in nature. This ordering is based on the rules posted by Postgres and is in #99849. However, Postgres currently sorts the empty JSON array before any other JSON values. A Postgres bug report for this has been opened: https://www.postgresql.org/message-id/17873-826fdc8bbcace4f1%40postgresql.org This PR intends on replicating the Postgres behavior. Fixes #105668 Epic: CRDB-24501 Release note: None 108160: roachtest/awsdms: run once a week instead r=Jeremyyang920 a=otan Save a bit of mad dosh by running awsdms once a weekly instead of daily. We don't need this tested every week. Epic: None Release note: None 108300: schemachanger: Unskip some backup tests r=Xiang-Gu a=Xiang-Gu Randomly skip subtests in the BACKUP/RESTORE suites before parallelizing them. Epic: None Release note: None 108328: rowexec: fix TestUncertaintyErrorIsReturned under race r=yuzefovich a=yuzefovich We just saw a case when `TestUncertaintyErrorIsReturned` failed under race because we got a different DistSQL plan. This seems plausible in case the range cache population (which the test does explicitly) isn't quick enough for some reason, so this commit allows for the DistSQL plan to match the expectation via `SucceedsSoon` (if we happen to get a bad plan, then the following query execution should have the up-to-date range cache). Fixes: #108250. Release note: None 108341: importer: fix stale comment on mysqlStrToDatum r=mgartner,DrewKimball a=otan Release note: None Epic: None From #108286 (review) 108370: go.mod: bump Pebble to fffe02a195e3 r=RahulAggarwal1016 a=RahulAggarwal1016 fffe02a1 db: simplify ScanInternal() df7e2ae1 vfs: deflake TestDiskHealthChecking_Filesystem ff5c929a Rate Limit Scan Statistics af8c5f27 internal/cache: mark panic messages as redaction-safe Epic: none Release note: none 108379: changefeedccl: deflake TestChangefeedSchemaChangeBackfillCheckpoint r=miretskiy a=jayshrivastava Previously, the test `TestChangefeedSchemaChangeBackfillCheckpoint` would fail because it would read a table span too early. A schema change using the delcarative schema changer will update a table span to point to a new set of ranges. Previously, this test would use the span from before the schema change, which is incorrect. This change makes it use the span from after the schema change. I stress tested this 30k times under the new schema changer and 10k times under the legacy schema changer to ensure the test is not flaky anymore. Fixes: #108084 Release note: None Epic: None Co-authored-by: Shivam Saraf <shivam.saraf@cockroachlabs.com> Co-authored-by: Oliver Tan <otan@cockroachlabs.com> Co-authored-by: Xiang Gu <xiang@cockroachlabs.com> Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com> Co-authored-by: Rahul Aggarwal <rahul.aggarwal@cockroachlabs.com> Co-authored-by: Jayant Shrivastava <jayants@cockroachlabs.com>
This commit adds a tech note that goes ahead to explain
how
JSONwill be key encoded in CRDB. It builds on theRFC for
JSONBencoding and explains the main motivationbehind the requirement for a key encoding in the first
place. The tech note also dives into details pertaining
to the encoding for each of the present
JSONvalues:NULL, Strings, Numbers, Boolean, Arrays and Objects.Lastly, the tech note also consists of examples showing
how
JSONvalues are encoded in hopes of simplifying theconcept to a new reader.
Release note: None
Epic: CRDB-24501