server: return non-live, live info per table#83677
server: return non-live, live info per table#83677craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
|
@ajwerner I create this version with the most straight forward solution. Are there improvements you can see I should be adding here from the start? |
86d6787 to
4c4e5f5
Compare
|
My fear is that for large tables this is going to now make the request just not work where before it worked. I don't really know what to do about that. |
|
Anything we can do with the info about total range count from here which is just a little below (and I could move my changes to after that)? |
|
I have a cute idea which might solve both the problems here and the problems in @yuzefovich how crazy does it sound? It sort of looks like if I had an implementation and hooked it up in the below code, then maybe it'd work? Is there more to know about planning to make the pieces fit together? cockroach/pkg/sql/colexec/builtin_funcs.go Line 122 in 928f605 |
yuzefovich
left a comment
There was a problem hiding this comment.
I'm not sure I follow your idea - what exactly would benefit from the vectorization? Or are you thinking that in the vectorized infrastructure we could create a specialized operator for a particular builtin function which internally has parallelism (to issue those range status lookups) unlike in the row-by-row infra? That seems reasonable to me, and I think the code you're pointing at would be the right entry point, and then things should mostly just work.
There would be some minor changes to expose the fact of concurrency in this builtin (if we're using a Txn, we'd have to ask for a LeafTxn, also vectorizedFlowCreator.operatorConcurrency from vectorized_flow.go will need to be set to true - actually doing that will ensure that we'll get a LeafTxn).
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)
Precisely.
Fortunately |
Hm, no, nothing else comes to mind (as long as the queries using this new builtin don't also include things like CDC / BulkIO - those don't run through the vectorized infra). |
|
@yuzefovich maybe you can help me out. I typed the code but it doesn't quite work (also I'm sure I'm not doing memory monitoring correctly, or at all). My understanding of why is that we don't actually vectorize the rendering of the expression. What can I do to force it to use the vectorized operator? leads to the builtin being called as part of evaluation in the |
|
Maybe this code here is the problem, or at least, the source of inflexibility: cockroach/pkg/sql/distsql_spec_exec_factory.go Lines 168 to 176 in 928f605 |
|
I think the problem is the usage of cockroach/pkg/sql/distsql_physical_planner.go Line 3083 in 78caa5c |
|
(Maybe it's not a "materialized view" but it's something (namely "virtual table" inside of it - |
|
Hmm, is there a way to force the code to look inside the expressions and inject some intermediate render node or something like that if there are optimized functions? I imagine there's a way to go from rows->columns->render->rows |
|
No, unfortunately nothing like that exists. The thing is that this |
|
Oh, actually, I take this, at least partially, back - the problem might be simpler than I initially thought. It does look like the rendering is in |
|
I typed up #83689 for that idea, but I think it's broken. However, it might be sufficient for your prototype. |
|
I didn't really follow your draft PR. Maybe we can collaborate on this for a little bit this afternoon if you're willing or some time next week. I've reached the point in the code where my lack of understanding of concepts means I'd need to learn a lot to make progress. |
|
(Bart delays provide ample time to code lol) |
|
Indeed: |
45244fa to
f86dcc9
Compare
|
@ajwerner @yuzefovich do you know if the range tables are not available during tests? |
|
Your code seems sad here: Line 933 in 8b83a21 It seems that the type is wrong the column it claims. Maybe you need to name the projection with the cast? I'm not sure. |
Before this change, we'd just log the error body, which often is not very helpful. The format specifier makes me think that the intention was to log the stacks. This made debugging [this](cockroachdb#83677 (comment)) trivial as opposed to hard. Release note: None
|
83875: opt: fix memo cycle caused by join ordering r=DrewKimball a=DrewKimball
In some rare cases when null-rejection rules fail to fire, a redundant filter
can be inferred in an `InnerJoin` - `LeftJoin` complex. This could previously
result in the `JoinOrderBuilder` attempting to add a `Select` to the same memo
group as its input, which would create a memo cycle. This patch prevents the
`JoinOrderBuilder` from adding the `Select` to the memo in such cases.
What follows is a more detailed explanation of the conditions that could
previously cause a memo cycle.
`InnerJoin` operators have two properties that make them more 'reorderable'
than other types of joins: (1) their conjuncts can be reordered separately
and (2) new conjuncts can be inferred from equalities. As a special case of
(1), an `InnerJoin` can be pushed into the left side of a `LeftJoin`, leaving
behind any `Select` conjuncts that reference the right side of the `LeftJoin`.
This allows the `JoinOrderBuilder` to make the following transformation:
```
(InnerJoin
A
(InnerJoin
B
(LeftJoin
C
D
c=d
)
b=c
)
a=b, a=d
)
=>
(InnerJoin
A
(Select
(LeftJoin
(InnerJoin
B
C
b=c
)
D
c=d
)
b=d // Inferred filter!
)
a=b, a=d
)
```
Note the new `b=d` filter that was inferred and subsequently left on
a `Select` operator after the reordering. The crucial point is that
this filter is redundant - the input to the `Select` is already a
valid reordering of the `BCD` join complex.
The `JoinOrderBuilder` avoids adding redundant filters to `InnerJoin`
operators, but does not do the same for the `Select` case because it
was assumed that the filters left on the `Select` would never be redundant.
This is because the `a=d` filter *should* rejects nulls, so the `LeftJoin`
should have been simplified. However, in rare cases null-rejection does not
take place. Because the input to the `Select` is already a valid reordering,
the `JoinOrderBuilder` ends up trying to add the `Select` to the same group
as its input - namely, the `BCD` join group. This causes a cycle in the memo.
Fixes #80901
Release note (bug fix): Fixed a bug that could cause an optimizer
panic in rare cases when a query had a left join in the input of
an inner join.
83945: storage/metamorphic: Add MVCC delete range using tombstone r=erikgrinaker a=itsbilal
This change adds MVCCDeleteRangeUsingTombstone to the MVCC
metamorphic tests. MVCCDeleteRange had already existed in
this test suite, so this ended up being a relatively simple
addition.
One part of #82545, with possibly more parts to follow
as other MVCC-level operations are added that utilize
`writer.{Put,Clear}{MVCC,Engine}RangeKey`.
Release note: None.
83992: server: log stacks on 500 errors r=ajwerner a=ajwerner
Before this change, we'd just log the error body, which often is not very
helpful. The format specifier makes me think that the intention was to log
the stacks. This made debugging [this](#83677 (comment))
trivial as opposed to hard.
Release note: None
84015: bazel: clear configurations when running `git grep` in `check.sh` r=miretskiy,mari-crl a=rickystewart
The configurations `grep.{column,lineNumber,fullName}` can be set
globally or on a per-user basis and change the output of `git grep`,
which breaks checks that do exact-string matching. We manually clear
these configurations for the `git grep`s in this file to ensure the
output is predictable on any machine.
Release note: None
Co-authored-by: Andrew Kimball <andrewekimball@gmail.com>
Co-authored-by: Bilal Akhtar <bilal@cockroachlabs.com>
Co-authored-by: Andrew Werner <awerner32@gmail.com>
Co-authored-by: Ricky Stewart <ricky@cockroachlabs.com>
This commit adds 3 new parameter to the table details endpoint: - dataTotalBytes - dataLiveBytes - dataLivePercentage Partially addresses cockroachdb#82617 Release note (api change): Add information about total bytes, live (non MVCC) bytes and live (non MVCC) percentage to Table Details endpoint.
|
TFTR! |
|
Build succeeded: |
This commit adds 3 new parameter to the table
details endpoint:
Partially addresses #82617
Release note (api change): Add information about total bytes,
non live (MVCC) bytes and non live (MVCC) percentage to
Table Details endpoint.