
jobs: bump default progress log time to 30s #25791

Merged
craig[bot] merged 1 commit into cockroachdb:master from madelynnblue:loop-time on May 31, 2018

Conversation

@madelynnblue (Contributor) commented May 22, 2018

The previous code allowed updates to be performed every 1s, which could
cause the MVCC row to be very large causing problems with splits. We
can update much more slowly by default. In the case of a small backup
job, the 5% fraction threshold will allow a speedier update rate.

Remove a note that's not useful anymore since the referred function
can now only be used in the described safe way.

See #25770. Although this change didn't fix that bug, we still think
it's a good idea.

Release note: None

@madelynnblue madelynnblue requested review from a team, danhhz and dt May 22, 2018 02:21
@cockroach-teamcity (Member)

This change is Reviewable

@nvb (Contributor) commented May 22, 2018

:lgtm:

I'm assuming that this effectively fixes the `workload fixtures make` failure in the referenced issue. Did you confirm that it does? If you haven't tried it already, I'd recommend setting `--warehouses=15000` to avoid colliding with the existing 10k fixture.


Review status: 0 of 1 files reviewed at latest revision, all discussions resolved.


pkg/sql/jobs/progress.go, line 26 at r1 (raw file):

```go
// progressFractionThreshold.
var (
	progressTimeThreshold             = time.Second * 30
```

nit: `30 * time.Second`



@madelynnblue (Contributor, Author)

@nvanbenschoten How long does it take for that workload command to fail? It's not going quickly in my tests so far when I'm verifying that master fails.

@nvb (Contributor) commented May 23, 2018

It took around 20 minutes to fail before, but I'd let the command finish successfully (~3 hours) before concluding that this fixes the issue completely.

@madelynnblue (Contributor, Author)

This didn't appear to fix the problem. I'm going to work on a test that can reproduce this faster.

@nvb (Contributor) commented May 23, 2018

You could try dropping the range size so that the row doesn't need to grow as large to trigger the error.

@madelynnblue (Contributor, Author)

I've removed the Fixes line so that this PR no longer closes the original bug, but I still think it's worth merging anyway.

@madelynnblue (Contributor, Author)

bors r+

craig bot pushed a commit that referenced this pull request May 31, 2018
25014: storage: queue requests to push txn / resolve intents on single keys r=spencerkimball a=spencerkimball

Previously, high contention on a single key would cause every thread to
push the same conflicting transaction then resolve the same intent in
parallel. This is inefficient as only one pusher needs to succeed, and
only one resolver needs to resolve the intent, and then only one writer
should proceed while the other readers/writers should in turn wait on
the previous writer by pushing its transaction. This effectively
serializes the conflicting reader/writers.
    
One complication is that all pushers which may have a valid, writing
transaction (i.e., `Transaction.Key != nil`), must push either the
conflicting transaction or another transaction already pushing that
transaction. This allows dependency cycles to be discovered.

Fixes #20448 

25791: jobs: bump default progress log time to 30s r=mjibson a=mjibson

The previous code allowed updates to be performed every 1s, which could
cause the MVCC row to be very large causing problems with splits. We
can update much more slowly by default. In the case of a small backup
job, the 5% fraction threshold will allow a speedier update rate.

Remove a note that's not useful anymore since the referred function
can now only be used in the described safe way.

See #25770. Although this change didn't fix that bug, we still think
it's a good idea.

Release note: None

26293: opt: enable a few distsql logictests r=RaduBerinde a=RaduBerinde

 - `distsql_indexjoin`: this is only a planning test. Modifying the
   split points and queries a bit to make the condition more
   restrictive and make the optimizer choose index joins. There was a
   single plan that was different, and the difference was minor (the
   old planner is emitting an unnecessary column).

 - `distsql_expr`: logic-only test, enabling for opt.

 - `distsql_scrub`: planning test; opt version commented out for now.

Release note: None

Co-authored-by: Spencer Kimball <spencer.kimball@gmail.com>
Co-authored-by: Matt Jibson <matt.jibson@gmail.com>
Co-authored-by: Radu Berinde <radu@cockroachlabs.com>
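The single-key contention handling described in the #25014 commit message above can be sketched as a small wait queue: the first contender on a key proceeds, later contenders block until the previous one hands the key over, so pushes and intent resolutions happen one at a time instead of in parallel. This is a toy illustration of the idea, not the CockroachDB storage implementation; the `contentionQueue` type and its methods are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// contentionQueue serializes conflicting requests on a single key so
// only one pusher/resolver runs at a time, rather than every waiter
// pushing the same transaction in parallel.
type contentionQueue struct {
	mu      sync.Mutex
	waiting map[string][]chan struct{}
}

func newContentionQueue() *contentionQueue {
	return &contentionQueue{waiting: make(map[string][]chan struct{})}
}

// enter returns nil for the first contender on key, which may proceed
// immediately; later contenders receive a channel and must block on it
// until the previous contender calls exit.
func (q *contentionQueue) enter(key string) chan struct{} {
	q.mu.Lock()
	defer q.mu.Unlock()
	if _, held := q.waiting[key]; !held {
		q.waiting[key] = nil // mark the key as held, no waiters yet
		return nil
	}
	ch := make(chan struct{})
	q.waiting[key] = append(q.waiting[key], ch)
	return ch
}

// exit hands the key to the next waiter (by closing its channel), or
// frees the key entirely if no one is queued.
func (q *contentionQueue) exit(key string) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if next := q.waiting[key]; len(next) > 0 {
		close(next[0])
		q.waiting[key] = next[1:]
	} else {
		delete(q.waiting, key)
	}
}

func main() {
	q := newContentionQueue()
	if q.enter("k") == nil {
		fmt.Println("writer 1 proceeds immediately")
	}
	ch := q.enter("k") // writer 2 must wait its turn
	fmt.Println("writer 2 queued:", ch != nil)
	q.exit("k") // writer 1 finishes; the key is handed to writer 2
	<-ch
	fmt.Println("writer 2 proceeds")
	q.exit("k")
}
```

Note the sketch omits the dependency-cycle concern from the commit message: a real implementation must still let a queued pusher with a writing transaction push someone, or deadlocks would go undetected.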
@craig (Contributor) commented May 31, 2018

Build succeeded

@craig craig bot merged commit 1faebfa into cockroachdb:master May 31, 2018
@madelynnblue madelynnblue deleted the loop-time branch May 31, 2018 21:02