jsonpath: add like_regex, string and null scalars#143240
jsonpath: add like_regex, string and null scalars#143240craig[bot] merged 3 commits intocockroachdb:masterfrom
like_regex, string and null scalars#143240Conversation
This commit adds string scalars, enabling string comparisons within jsonpath queries. Epic: None Release note (sql change): Support string comparisons within jsonpath queries.
d5e30ec to
86603a3
Compare
yuzefovich
left a comment
There was a problem hiding this comment.
Very nice!
Reviewed 4 of 4 files at r1, 3 of 3 files at r2, 10 of 10 files at r3, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @normanchenn)
-- commits line 28 at r3:
nit: should we mention that flag part is not supported? Or do you think we'll add that soon?
pkg/util/jsonpath/parser/jsonpath.y line 459 at r1 (raw file):
unreserved_keyword: AND
Hm, why AND, NOT, and OR are being removed in the first commit?
pkg/sql/logictest/testdata/logic_test/jsonb_path_query line 794 at r2 (raw file):
query T SELECT jsonb_path_query('{}', 'null == null');
nit: did you mean to make this !=?
pkg/util/jsonpath/eval/operation.go line 112 at r3 (raw file):
return jsonpathBoolUnknown, err } res := r.Find([]byte(*text))
nit: I think r.MatchString would be cleaner here.
pkg/util/jsonpath/parser/jsonpath.y line 460 at r1 (raw file):
any_identifier: IDENT
nit: should we remove IDENT since I think it's no longer used?
pkg/sql/logictest/testdata/logic_test/jsonb_path_query line 851 at r3 (raw file):
"987_650" "987_424"
nit: should we add a couple test cases where regex uses a backslash?
86603a3 to
3366e20
Compare
normanchenn
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @normanchenn and @yuzefovich)
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: should we mention that
flagpart is not supported? Or do you think we'll add that soon?
I'm not sure if we should support it yet - I'm not sure if theres a straightforward way to add these flags with the current implementation.
pkg/util/jsonpath/parser/jsonpath.y line 459 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Hm, why AND, NOT, and OR are being removed in the first commit?
I don't think AND, NOT, and OR should be considered unreserved keywords since when we're scanning for these tokens, they are symbols such as &&, ||, and !, so when lexbase.GetKeywordID is called, I think it would make more sense to identify the string as an IDENT instead of AND for a key such as "and"
pkg/util/jsonpath/parser/jsonpath.y line 460 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: should we remove IDENT since I think it's no longer used?
We are still using IDENT, because most key accessor names are identified as IDENT during scanning. For example, in $.a, a is an IDENT. See the call to lexbase.GetKeywordID in JSONPathScanner.scanIdent.
yuzefovich
left a comment
There was a problem hiding this comment.
Nice! Glad that UnescapePattern does what we need (and, philosophically speaking, it makes sense that we'd need to do something very similar for LIKE comparison operator). I do have one more comment.
Reviewed 10 of 10 files at r4, 13 of 13 files at r5, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @normanchenn)
pkg/util/jsonpath/parser/jsonpath.y line 459 at r1 (raw file):
Previously, normanchenn (Norman Chen) wrote…
I don't think AND, NOT, and OR should be considered unreserved keywords since when we're scanning for these tokens, they are symbols such as
&&,||, and!, so whenlexbase.GetKeywordIDis called, I think it would make more sense to identify the string as an IDENT instead of AND for a key such as "and"
I see, sgtm.
pkg/util/jsonpath/parser/jsonpath.y line 460 at r1 (raw file):
Previously, normanchenn (Norman Chen) wrote…
We are still using IDENT, because most key accessor names are identified as IDENT during scanning. For example, in
$.a, a is an IDENT. See the call tolexbase.GetKeywordIDinJSONPathScanner.scanIdent.
Ack, thanks, I missed this usage.
pkg/sql/logictest/testdata/logic_test/jsonb_path_query line 794 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: did you mean to make this
!=?
nit: this fixup belongs in the second commit, not third 😃
pkg/sql/logictest/testdata/logic_test/jsonb_path_query line 879 at r5 (raw file):
"https://test.com/path"
nit: double newline.
pkg/util/jsonpath/eval/operation.go line 115 at r5 (raw file):
regex := op.Right.(jsonpath.Regex) regex.Regex, err = eval.UnescapePattern(regex.Regex, `\`, false)
Hm, I think this might be problematic since we're modifying the right operand in place. What will happen if we evaluate the same jsonpath like_regex expression on multiple JSON objects (i.e. multiple rows from the table)? I'd guess that we'd keep unescaping the right side every time, in-place.
Also nit: add inlined comments like /* escapeToken */ and for false too.
yuzefovich
left a comment
There was a problem hiding this comment.
I did a quick run of relevant pg_regress files, and one difference caught my eye:
select jsonb_path_query('[null, 1, "abc", "abd", "aBdC", "abdacb", "babc", "adc\nabc", "ab\nadc"]', 'lax $[*] ? (@ like_regex "^ab.*c")');PG returns two rows whereas we return three. At the same time we do match PG on:
select 'ab\nadc' like '^ab.*c';Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @normanchenn)
pkg/util/jsonpath/eval/operation.go line 115 at r5 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Hm, I think this might be problematic since we're modifying the right operand in place. What will happen if we evaluate the same jsonpath like_regex expression on multiple JSON objects (i.e. multiple rows from the table)? I'd guess that we'd keep unescaping the right side every time, in-place.
Also nit: add inlined comments like
/* escapeToken */and forfalsetoo.
Hm, on the second thought, we are operating with the value, not the pointer, so perhaps this will be fine. Actually, your tests should already cover this - we should be invoking evalRegex multiple times when we have multiple paths.
This commit adds null scalars, enabling null comparisons within jsonpath queries. Epic: None Release note (sql change): Support null comparisons within jsonpath queries.
3366e20 to
a84f0df
Compare
This commit add `like_regex` predicate evaluation support. Flags for `like_regex` are not supported yet. Epic: None Release note (sql change): Add `like_regex` predicate evaluation support for jsonpath queries. Flags for `like_regex` are not supported yet.
a84f0df to
7cff4e8
Compare
normanchenn
left a comment
There was a problem hiding this comment.
Should be fixed now.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @yuzefovich)
pkg/util/jsonpath/eval/operation.go line 115 at r5 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Hm, on the second thought, we are operating with the value, not the pointer, so perhaps this will be fine. Actually, your tests should already cover this - we should be invoking
evalRegexmultiple times when we have multiple paths.
I changed this to not use UnescapePattern anymore, as setting allowEscapes = true during scanning and retrieving the json string with AsText() seems to do the trick.
yuzefovich
left a comment
There was a problem hiding this comment.
Reviewed 13 of 13 files at r8, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @normanchenn)
|
TFTR! bors r+ |
jsonpath/parser: add support for string scalars
This commit adds string scalars, enabling string comparisons within
jsonpath queries.
Epic: None
Release note (sql change): Support string comparisons within jsonpath
queries.
jsonpath/parser: add support for null scalars.
This commit adds null scalars, enabling null comparisons within jsonpath
queries.
Epic: None
Release note (sql change): Support null comparisons within jsonpath
queries.
jsonpath: add
like_regexsupportThis commit add
like_regexpredicate evaluation support. Flags forlike_regexare not supported yet.Informs: #143243
Epic: None
Release note (sql change): Add
like_regexpredicate evaluation supportfor jsonpath queries. Flags for
like_regexare not supported yet.