Conversation
pkg/sql/tablewriter.go
func (td *tableDeleter) row(
	ctx context.Context, values tree.Datums, traceKV bool,
) (tree.Datums, error) {
	if td.batchSize > 10000 {
Extract the constant into the global scope and give it an explanatory comment.
If this merges into master, please also cherry-pick into 2.0.
pkg/sql/tablewriter.go, line 740 at r1: Previously, knz (kena) wrote…
Not sure the constant needs to be in the global scope given it is only accessed here, but 👍 on making a constant. Also, shouldn't this be
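The reviewers' suggestion could look something like the sketch below; the identifier `tableDeleterBatchSize` and its comment are hypothetical stand-ins, not the names actually merged:

```go
package main

import "fmt"

// tableDeleterBatchSize is a hypothetical name for the extracted
// constant. It caps how many deletions accumulate in a single KV batch
// before the batch is flushed, bounding the gateway node's memory use,
// rather than leaving the magic number 10000 inline in row().
const tableDeleterBatchSize = 10000

func main() {
	fmt.Println(tableDeleterBatchSize)
}
```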
In the absence of a fast path deletion, `DELETE` would generate one potentially giant batch and OOM the gateway node. This became obvious quickly via heap profiling. Added chunking of the deletions to `tableDeleter`. SQL folks may have stronger opinions on how to achieve this, or a better idea of a preexisting chunking mechanism that works more reliably. If nothing else, this change serves as a prototype to fix cockroachdb#17921. With this change, `roachtest run drop` works (as in, it doesn't out-of-memory right away; the run takes a long time so I can't yet confirm that it actually passes). Release note (sql change): deleting many rows at once now consumes less memory.
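The chunking idea described in the commit message can be sketched as follows; the function name, the `[]string` key type, and the `flush` callback are illustrative assumptions, not the actual `tableDeleter` API:

```go
package main

import "fmt"

// deleteInChunks sketches the chunking approach: instead of building
// one potentially giant batch for the whole DELETE, flush whenever the
// accumulated batch reaches chunkSize entries, so memory stays bounded.
func deleteInChunks(keys []string, chunkSize int, flush func([]string)) int {
	flushes := 0
	batch := make([]string, 0, chunkSize)
	for _, k := range keys {
		batch = append(batch, k)
		if len(batch) >= chunkSize {
			flush(batch)
			flushes++
			batch = batch[:0] // reuse the backing array for the next chunk
		}
	}
	// Flush any trailing partial chunk.
	if len(batch) > 0 {
		flush(batch)
		flushes++
	}
	return flushes
}

func main() {
	keys := make([]string, 25000)
	n := deleteInChunks(keys, 10000, func([]string) {})
	fmt.Println(n) // 3: two full chunks of 10000 plus one of 5000
}
```

Real code would run each flushed chunk in its own KV batch (and copy the slice if the callback retains it); this sketch only shows the control flow.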
@tschottdorf (+@knz, @andreimatei) when I ran this locally, I noticed that we seem to be scanning and buffering the entire set of rows to be deleted. It seems with this change that the total memory footprint should not soar into the GiBs. Why doesn't SQL consume rows from the scanner in a streaming fashion to feed to the table writer?
Spencer, is this what you're looking for?
#16180