Skip to content

sql: fix two bugs with TRUNCATE#28721

Merged
craig[bot] merged 3 commits intocockroachdb:masterfrom
vivekmenezes:truncatebackfill
Aug 27, 2018
Merged

sql: fix two bugs with TRUNCATE#28721
craig[bot] merged 3 commits intocockroachdb:masterfrom
vivekmenezes:truncatebackfill

Conversation

@vivekmenezes
Copy link
Copy Markdown
Contributor

@vivekmenezes vivekmenezes commented Aug 16, 2018

sql: correctly resolve schema mutations during TRUNCATE
related to #27687

sql: mark schema change job as successful when table DROP/TRUNCATE
The job is modified in the same transaction as the user's DROP/TRUNCATE transaction.

@vivekmenezes vivekmenezes requested review from a team and knz August 16, 2018 20:03
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

Copy link
Copy Markdown
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First two commits :lgtm_strong: .

Last commit: :lgtm: on principle, I like the idea to cancel ongoing jobs. However I recommend to actually stop the jobs with an error message that says "table was dropped" so that the reason can be inspected with SHOW JOBS.

Reviewed 1 of 1 files at r1, 6 of 6 files at r2, 4 of 4 files at r3.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained


pkg/sql/drop_table.go, line 359 at r3 (raw file):

	}

	// Mark all jobs scheduled for schema changes as successful.

Just for clarity: can you extend the commit message and the PR description to explain this new behavior, and outline at a high level:

  • what happens if the DROP is aborted (e.g. its txn is rolled back) -- what happens with these jobs?
  • what happens if the DROP is retried -- what happens with the jobs?
  • what happens if a user uses AS OF SYSTEM TIME or a restore in-between the time the schema change jobs were issued, and the time where DROP was issued.

pkg/sql/drop_table.go, line 372 at r3 (raw file):

				return err
			}
			if err := job.WithTxn(p.txn).Succeeded(ctx, jobs.NoopFn); err != nil {

Maybe instead fail the job with an error that says "table has been dropped". This way the reason why the job was stopped becomes clear in the event log / SHOW JOBS.

Copy link
Copy Markdown
Contributor Author

@vivekmenezes vivekmenezes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for review this.

The code path can be executed via TRUNCATE and DROP. In the case of TRUNCATE the schema changes were short circuited and indeed finished. In the case of DROP, the table has been dropped so one argue those schema changes were short circuited to and completed. I suppose one can ask, is there an action the user can take if there saw the extra message, or would they find it uncomfortable to see the schema change complete?

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained


pkg/sql/drop_table.go, line 359 at r3 (raw file):

Previously, knz (kena) wrote…

Just for clarity: can you extend the commit message and the PR description to explain this new behavior, and outline at a high level:

  • what happens if the DROP is aborted (e.g. its txn is rolled back) -- what happens with these jobs?
  • what happens if the DROP is retried -- what happens with the jobs?
  • what happens if a user uses AS OF SYSTEM TIME or a restore in-between the time the schema change jobs were issued, and the time where DROP was issued.

Added a comment to the PR saying that the change to jobs table is done in the same txn. The AS OF SYSTEM TIME question seems to be irrelevant to this PR.

Copy link
Copy Markdown
Contributor Author

@vivekmenezes vivekmenezes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained


pkg/sql/drop_table.go, line 372 at r3 (raw file):

Previously, knz (kena) wrote…

Maybe instead fail the job with an error that says "table has been dropped". This way the reason why the job was stopped becomes clear in the event log / SHOW JOBS.

Done.

Copy link
Copy Markdown
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes my point in my previous comment is that if the job gets aborted because the table is dropped, this is not a normal schema change completion.

If a schema change is marked to have completed successfully, then there exists a point in time which the user can use with AS OF SYSTEM TIME to go and observe the data just after the schema change has completed.

If you make the job table lie about the completion (by marking the job as completed whereas the change did not, in fact, complete), then the user who uses AS OF SYSTEM TIME to go to the end of the job will see incorrect/invalid/incomplete data in their database.

This is why I think it is important to mark the job as failed and not succeeded.

Reviewed 6 of 6 files at r5, 4 of 4 files at r6.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained


pkg/sql/drop_table.go, line 359 at r3 (raw file):

Previously, vivekmenezes wrote…

Added a comment to the PR saying that the change to jobs table is done in the same txn. The AS OF SYSTEM TIME question seems to be irrelevant to this PR.

Where is the comment?
I was thinking to paste it at the top of the PR (in the first description) because this is the text that gets copied to the git log. Comments below that do not go to the git log.


pkg/sql/drop_table.go, line 372 at r3 (raw file):

Previously, vivekmenezes wrote…

Done.

I don't see a change?

Copy link
Copy Markdown
Contributor Author

@vivekmenezes vivekmenezes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking the time to review this and I appreciate your comments.

Well what I'm saying is that the job is marked successful when the schema change has completed. The job has a timestamp at which it was marked successful. If you use AS OF SYSTEM TIME before that time you should not expect the schema change to have been completed. If you use a timestamp after that you can expect 1. The schema change complete on a TRUNCATE table, or 2. The schema change complete on a DROP table (which has no schema by definition).

If there is some confusion in this matter it's worthy to consider 1. that this set of events happens rarely, and 2. when it does happen is there really going to be confusion.

Thanks

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained


pkg/sql/drop_table.go, line 359 at r3 (raw file):

Previously, knz (kena) wrote…

Where is the comment?
I was thinking to paste it at the top of the PR (in the first description) because this is the text that gets copied to the git log. Comments below that do not go to the git log.

Done.


pkg/sql/drop_table.go, line 372 at r3 (raw file):

Previously, knz (kena) wrote…

I don't see a change?

I'm arguing against this change.

@vivekmenezes
Copy link
Copy Markdown
Contributor Author

@knz do you have any thoughts on this PR? Thanks!

Copy link
Copy Markdown
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You still have me on record that if a schema change is aborted by a DROP, then it should be marked as aborted not completed.

But in the end this is a subjective argument from the perspective of UX, not a correctness argument. So your guess is probably as good as mine.

I have approved the PR already so feel free to move forward.

Reviewed 5 of 6 files at r8, 2 of 4 files at r9.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained


pkg/sql/drop_table.go, line 372 at r3 (raw file):

Previously, vivekmenezes wrote…

I'm arguing against this change.

I was confused by the word "Done".

related to cockroachdb#27687

Release note (sql change): fixes table descriptor corruption when
TRUNCATE TABLE is run while DROP COLUMN is being processed.
The job is modified in the same transaction as the user's
DROP/TRUNCATE transaction.

Release note (sql change): Fixes problem where when a DROP/TRUNCATE is
run while a schema change like CREATE INDEX is running, the corresponding
job would stay running indefinitely.
@vivekmenezes
Copy link
Copy Markdown
Contributor Author

Thank you for the review!

@vivekmenezes
Copy link
Copy Markdown
Contributor Author

bors r+

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Aug 27, 2018

Build failed (retrying...)

craig bot pushed a commit that referenced this pull request Aug 27, 2018
28721: sql: fix two bugs with TRUNCATE r=vivekmenezes a=vivekmenezes

sql: correctly resolve schema mutations during TRUNCATE
related to #27687

sql: mark schema change job as successful when table DROP/TRUNCATE
The job is modified in the same transaction as the user's DROP/TRUNCATE transaction.


29134: rangefeed: small perf-related changes r=nvanbenschoten a=nvanbenschoten

This PR includes two small perf tweak that came up
which running cdc/w=100/nodes=3/init=true.

Release note: None

Co-authored-by: Vivek Menezes <vivek@cockroachlabs.com>
Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
@craig
Copy link
Copy Markdown
Contributor

craig bot commented Aug 27, 2018

Build succeeded

@craig craig bot merged commit e6c904f into cockroachdb:master Aug 27, 2018
@vivekmenezes vivekmenezes deleted the truncatebackfill branch August 28, 2018 00:45
craig bot pushed a commit that referenced this pull request Aug 29, 2018
29262: backport-2.1: sql: fix two bugs with TRUNCATE r=vivekmenezes a=vivekmenezes

Backport 3/3 commits from #28721.

/cc @cockroachdb/release

---

sql: correctly resolve schema mutations during TRUNCATE
related to #27687

sql: mark schema change job as successful when table DROP/TRUNCATE
The job is modified in the same transaction as the user's DROP/TRUNCATE transaction.



Co-authored-by: Vivek Menezes <vivek@cockroachlabs.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants