sql, kv: use Delete instead of DeleteRange to lessen impact on tscache #23258

Merged
spencerkimball merged 1 commit into cockroachdb:master from spencerkimball:delete-from
Mar 1, 2018

Conversation

@spencerkimball
Member

@spencerkimball spencerkimball commented Mar 1, 2018

Previously, on the slow path for deletion from SQL, rows were deleted
using a call to DeleteRange in order to remove all column families
with a single statement. It turns out this is better done using one
call per column family to Delete. DeleteRange must update the
write timestamp cache in order to avoid lost update anomalies. By
lessening the pressure on the timestamp cache, we can avoid pushing
a transaction's commit timestamp forward and triggering a retry. This
allows substantially longer / larger DELETE FROM statements to run
to completion, assuming there is no other concurrency which will force
a retry.

In conjunction, a realization from #21165 was that large DELETE FROM
statements can end up running forever. This occurs because they are
retryable from the SQL executor and continue to get retryable errors
due to the timestamp cache being unable to avoid pushing their
timestamps forward.

Fixes #17921

Release note (SQL change): will fix endless churn experienced from
large DELETE FROM statements, either by allowing them to complete,
or by exiting with an error message indicating the transaction is too
large to complete.

@spencerkimball spencerkimball requested review from a team and bdarnell March 1, 2018 00:13
@cockroach-teamcity
Member

This change is Reviewable

@spencerkimball
Member Author

+cc @petermattis

@tbg
Member

tbg commented Mar 1, 2018

@spencerkimball #17921 seems worth mentioning as well; I think I reported the same behavior there.

@spencerkimball
Member Author

@tschottdorf yes, I believe this fixes that problem, as it's exactly what my testing was showing. We would spin endlessly because it would auto-retry. Now either we'll finish the big DELETE FROM because there's zero pressure on the write timestamp cache, or we'll get a retry but fail with a non-retryable error, since the refresh spans were discarded once the transaction became too large.

@petermattis
Collaborator

:lgtm: though I only looked at the SQL change. Someone more familiar with the recent refresh spans work will have to look at the other changes.


Review status: 0 of 4 files reviewed at latest revision, 1 unresolved discussion.


pkg/sql/sqlbase/rowwriter.go, line 1028 at r1 (raw file):

			log.VEventf(ctx, 2, "Del %s", keys.PrettyPrint(rd.Helper.primIndexValDirs, rd.key))
		}
		b.Del(rd.key)

I think you should pass &rd.key here, which is the same as rd.key except it avoids the allocation when assigning the Key to an interface.


Comments from Reviewable

@tbg
Member

tbg commented Mar 1, 2018 via email

@bdarnell
Contributor

bdarnell commented Mar 1, 2018

:lgtm:


Reviewed 4 of 4 files at r1.
Review status: all files reviewed at latest revision, 2 unresolved discussions, some commit checks failed.


pkg/sql/sqlbase/rowwriter.go, line 1023 at r1 (raw file):

	// Delete the row.
	for _, family := range rd.Helper.TableDesc.Families {

Are there any issues with schema changes here? The old code was guaranteed to delete all families that did or will exist. I think schema leases will work here but @danhhz should confirm.


Comments from Reviewable

@spencerkimball
Member Author

@tschottdorf I doubt it will fail. I've tested this up to 1M rows and it works fine with serializable (keep in mind the read timestamp cache is getting 10,000 row spans only, so that's just 100 spans). If the system is otherwise busy, and the read timestamp cache is full, then we'd end up hitting that error, but unless I misunderstand what's happening in the test, it'll run just fine. Also added unittests.

I'm having trouble running the roachtest. My environment might be way out of date:

SPENCERs-MacBook-Pro:cockroach spencerkimball$ ROACHPROD_USER=spencer roachtest run drop
=== RUN   drop/tpcc/w=100,nodes=9
> roachprod create spencer-1519908070-drop-tpcc-w-100-nodes-9 -n 9
Creating cluster spencer-1519908070-drop-tpcc-w-100-nodes-9 with 9 nodes
Error:  in provider: gce: Command: gcloud [compute instances create --subnet default --maintenance-policy MIGRATE --service-account 21965078311-compute@developer.gserviceaccount.com --scopes default,storage-rw --image ubuntu-1604-xenial-v20171002 --image-project ubuntu-os-cloud --boot-disk-size 10 --boot-disk-type pd-ssd --local-ssd interface=SCSI --machine-type n1-standard-4 --labels lifetime=12h0m0s --metadata-from-file startup-script=/var/folders/83/r_nmcwd969g5qc0b7my9wl900000gn/T/gce-startup-script952910348 --project cockroach-ephemeral]
Output: usage: gcloud compute instances create  NAMES [NAMES ...] [optional flags]
ERROR: (gcloud.compute.instances.create) unrecognized arguments:
  --service-account (did you mean '--account'?)
  --labels (did you mean '--local-ssd'?)
  lifetime=12h0m0s
  spencer-1519908070-drop-tpcc-w-100-nodes-9-0001
  spencer-1519908070-drop-tpcc-w-100-nodes-9-0002
  spencer-1519908070-drop-tpcc-w-100-nodes-9-0003
  spencer-1519908070-drop-tpcc-w-100-nodes-9-0004
  spencer-1519908070-drop-tpcc-w-100-nodes-9-0005
  spencer-1519908070-drop-tpcc-w-100-nodes-9-0006
  spencer-1519908070-drop-tpcc-w-100-nodes-9-0007
  spencer-1519908070-drop-tpcc-w-100-nodes-9-0008
  spencer-1519908070-drop-tpcc-w-100-nodes-9-0009
: exit status 2

Review status: 3 of 5 files reviewed at latest revision, 2 unresolved discussions.


pkg/sql/sqlbase/rowwriter.go, line 1023 at r1 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Are there any issues with schema changes here? The old code was guaranteed to delete all families that did or will exist. I think schema leases will work here but @danhhz should confirm.

The schema leases better work! From what I'm seeing they do, because the transaction's deadline continues to get moved forward as the transaction goes on and on.


pkg/sql/sqlbase/rowwriter.go, line 1028 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

I think you should pass &rd.key here, which is the same as rd.key except it avoids the allocation when assigning the Key to an interface.

Done. I did the same for the index entries above as well, which could be an even tighter loop than here.


Comments from Reviewable

@danhhz
Contributor

danhhz commented Mar 1, 2018

Review status: 3 of 5 files reviewed at latest revision, 2 unresolved discussions, some commit checks failed.


pkg/sql/sqlbase/rowwriter.go, line 1023 at r1 (raw file):

Previously, spencerkimball (Spencer Kimball) wrote…

The schema leases better work! From what I'm seeing they do, because the transaction's deadline continues to get moved forward as the transaction goes on and on.

It's been a while since I've had this all loaded up, but the reader code ignores any columns that are not in the right schema change state, so I think this is correct. @jordanlewis has thought about this much more recently than I have, so we may want to double-check with him.


Comments from Reviewable

@spencerkimball
Member Author

@tschottdorf:

> roachprod ssh spencer-1519936200-drop-tpcc-w-100-nodes-9:1 -- ./cockroach sql --insecure -e "DELETE FROM tpcc.stock"
[   6] drop/tpcc/w=100,nodes=9: DELETE FROM tpcc.stock (17s)
[   7] drop/tpcc/w=100,nodes=9: DELETE FROM tpcc.stock (1m17s)
[   8] drop/tpcc/w=100,nodes=9: DELETE FROM tpcc.stock (2m17s)
[   9] drop/tpcc/w=100,nodes=9: DELETE FROM tpcc.stock (3m17s)
[  10] drop/tpcc/w=100,nodes=9: DELETE FROM tpcc.stock (4m17s)
[  11] drop/tpcc/w=100,nodes=9: DELETE FROM tpcc.stock (5m17s)
[  12] drop/tpcc/w=100,nodes=9: DELETE FROM tpcc.stock (6m17s)
[  13] drop/tpcc/w=100,nodes=9: DELETE FROM tpcc.stock (7m17s)
[  14] drop/tpcc/w=100,nodes=9: DELETE FROM tpcc.stock (8m17s)
[  15] drop/tpcc/w=100,nodes=9: DELETE FROM tpcc.stock (9m17s)
DELETE 10000000
> roachprod ssh spencer-1519936200-drop-tpcc-w-100-nodes-9:1 -- ./cockroach sql --insecure -e "TRUNCATE TABLE tpcc.stock"
TRUNCATE
> roachprod ssh spencer-1519936200-drop-tpcc-w-100-nodes-9:1 -- ./cockroach sql --insecure -e "DROP DATABASE tpcc"
DROP DATABASE
> roachprod ssh spencer-1519936200-drop-tpcc-w-100-nodes-9:1 -- ./cockroach sql --insecure -e "ALTER RANGE default EXPERIMENTAL CONFIGURE ZONE '
gc:
  ttlseconds: 1
'"
CONFIGURE ZONE 1
> roachprod get spencer-1519936200-drop-tpcc-w-100-nodes-9 logs artifacts/drop/tpcc/w=100,nodes=9/logs
spencer-1519936200-drop-tpcc-w-100-nodes-9: getting logs artifacts/drop/tpcc/w=100,nodes=9/logs
.....
   1: done
   2: done
   3: done
   4: done
   5: done
   6: done
   7: done
   8: done
   9: done
> roachprod destroy spencer-1519936200-drop-tpcc-w-100-nodes-9
Destroying cluster spencer-1519936200-drop-tpcc-w-100-nodes-9 with 9 nodes
[  16] drop/tpcc/w=100,nodes=9: destroying cluster (34s)
OK
--- PASS: drop/tpcc/w=100,nodes=9 (972.31s)
PASS

@spencerkimball spencerkimball force-pushed the delete-from branch 2 times, most recently from 484bc3e to 2ab7955 Compare March 1, 2018 20:52
Previously, on the slow path for deletion from SQL, rows were deleted
using a call to `DeleteRange` in order to remove all column families
with a single statement. It turns out this is better done using one
call per column family to `Delete`. `DeleteRange` must update the
write timestamp cache in order to avoid lost update anomalies. By
lessening the pressure on the timestamp cache, we can avoid pushing
a transaction's commit timestamp forward and triggering a retry. This
allows substantially longer / larger `DELETE FROM` statements to run
to completion, assuming there is no other concurrency which will force
a retry.

In conjunction, a realization from cockroachdb#21165 was that large `DELETE FROM`
statements can end up running forever. This occurs because they are
retryable from the SQL executor and continue to get retryable errors
due to the timestamp cache being unable to avoid pushing their
timestamps forward.

Release note (sql change): will fix endless churn experienced from
large `DELETE FROM` statements, either by allowing them to complete,
or by exiting with an error message indicating the transaction is too
large to complete.

Development

Successfully merging this pull request may close these issues.

DELETE FROM ... [returning nothing] crashes a node

6 participants