
storage/concurrency: benchmark for lockTable #44964

Merged
craig[bot] merged 1 commit into cockroachdb:master from sumeerbhola:ltbench on Feb 11, 2020

Conversation

@sumeerbhola
Collaborator

Most of the allocations in lockTable are due to the temporary
*lockState created for btree lookups, and the total bytes allocated
are roughly equal to the bytes allocated by
spanlatch.allocGuardAndLatches in the contended benchmarks. CPU time
is spent mostly in runtime.mcall, within pthread_cond_wait and
pthread_cond_signal.

Release note: None

@sumeerbhola sumeerbhola requested a review from nvb February 11, 2020 15:30
@cockroach-teamcity
Member

This change is Reviewable

Contributor

@nvb nvb left a comment


:lgtm: this looks really good. Do you mind posting the results to the PR and in the commit message? Ideally, you'd run with `-benchmem` and then pass the output to `benchstat`.

Reviewed 1 of 1 files at r1.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @sumeerbhola)


pkg/storage/concurrency/lock_table_test.go, line 998 at r1 (raw file):

```go
&item.Txn.TxnMeta
```

This will allocate if we don't pass benchWorkItem in as a pointer to this function.

To do this, replace:

```go
go doBenchWork(items[i], env, requestDoneCh)
```

with

```go
go doBenchWork(&items[i], env, requestDoneCh)
```

pkg/storage/concurrency/lock_table_test.go, line 1003 at r1 (raw file):

```go
		}
	}
	env.lm.Release(lg)
```

Was there a reason you're acquiring latches regardless of whether there are any locks to acquire or not?

Most of the allocations in lockTable are due to the temporary
*lockState created for btree lookups, and the total bytes allocated
are roughly equal to the bytes allocated by
spanlatch.allocGuardAndLatches in the contended benchmarks. CPU time
is spent mostly in runtime.mcall, within pthread_cond_wait and
pthread_cond_signal.

Benchmark output on my machine:
BenchmarkLockTable/groups=1,outstanding=1,read=0/-16         	  200000	     12149 ns/op	    7257 B/op	      34 allocs/op
BenchmarkLockTable/groups=1,outstanding=1,read=1/-16         	  200000	      9288 ns/op	    6294 B/op	      28 allocs/op
BenchmarkLockTable/groups=1,outstanding=1,read=2/-16         	  200000	     10132 ns/op	    5331 B/op	      22 allocs/op
BenchmarkLockTable/groups=1,outstanding=1,read=3/-16         	  200000	      6380 ns/op	    4367 B/op	      16 allocs/op
BenchmarkLockTable/groups=1,outstanding=1,read=4/-16         	  300000	      4207 ns/op	    2501 B/op	       9 allocs/op
BenchmarkLockTable/groups=1,outstanding=1,read=5/-16         	  300000	      4826 ns/op	    2501 B/op	       9 allocs/op
BenchmarkLockTable/groups=1,outstanding=2,read=0/-16         	  100000	     19704 ns/op	    9350 B/op	      43 allocs/op
BenchmarkLockTable/groups=1,outstanding=2,read=1/-16         	  100000	     13420 ns/op	    8400 B/op	      37 allocs/op
BenchmarkLockTable/groups=1,outstanding=2,read=2/-16         	  100000	     12233 ns/op	    7427 B/op	      31 allocs/op
BenchmarkLockTable/groups=1,outstanding=2,read=3/-16         	  200000	     11819 ns/op	    6446 B/op	      25 allocs/op
BenchmarkLockTable/groups=1,outstanding=2,read=4/-16         	  500000	      2810 ns/op	    2503 B/op	       9 allocs/op
BenchmarkLockTable/groups=1,outstanding=2,read=5/-16         	  500000	      2796 ns/op	    2503 B/op	       9 allocs/op
BenchmarkLockTable/groups=1,outstanding=4,read=0/-16         	  100000	     18419 ns/op	    9379 B/op	      43 allocs/op
BenchmarkLockTable/groups=1,outstanding=4,read=1/-16         	  100000	     14616 ns/op	    8402 B/op	      37 allocs/op
BenchmarkLockTable/groups=1,outstanding=4,read=2/-16         	  100000	     12598 ns/op	    7430 B/op	      31 allocs/op
BenchmarkLockTable/groups=1,outstanding=4,read=3/-16         	  200000	     11091 ns/op	    6463 B/op	      25 allocs/op
BenchmarkLockTable/groups=1,outstanding=4,read=4/-16         	 1000000	      1964 ns/op	    2505 B/op	       9 allocs/op
BenchmarkLockTable/groups=1,outstanding=4,read=5/-16         	 1000000	      2079 ns/op	    2505 B/op	       9 allocs/op
BenchmarkLockTable/groups=1,outstanding=8,read=0/-16         	  100000	     16523 ns/op	    9362 B/op	      43 allocs/op
BenchmarkLockTable/groups=1,outstanding=8,read=1/-16         	  100000	     15131 ns/op	    8395 B/op	      37 allocs/op
BenchmarkLockTable/groups=1,outstanding=8,read=2/-16         	  100000	     14093 ns/op	    7429 B/op	      31 allocs/op
BenchmarkLockTable/groups=1,outstanding=8,read=3/-16         	  100000	     12182 ns/op	    6463 B/op	      25 allocs/op
BenchmarkLockTable/groups=1,outstanding=8,read=4/-16         	 1000000	      1768 ns/op	    2505 B/op	       9 allocs/op
BenchmarkLockTable/groups=1,outstanding=8,read=5/-16         	 1000000	      2016 ns/op	    2505 B/op	       9 allocs/op
BenchmarkLockTable/groups=1,outstanding=16,read=0/-16        	  100000	     17909 ns/op	    9357 B/op	      43 allocs/op
BenchmarkLockTable/groups=1,outstanding=16,read=1/-16        	  100000	     15952 ns/op	    8392 B/op	      37 allocs/op
BenchmarkLockTable/groups=1,outstanding=16,read=2/-16        	  100000	     14637 ns/op	    7426 B/op	      31 allocs/op
BenchmarkLockTable/groups=1,outstanding=16,read=3/-16        	  100000	     12950 ns/op	    6461 B/op	      25 allocs/op
BenchmarkLockTable/groups=1,outstanding=16,read=4/-16        	 1000000	      1849 ns/op	    2506 B/op	       9 allocs/op
BenchmarkLockTable/groups=1,outstanding=16,read=5/-16        	 1000000	      1943 ns/op	    2505 B/op	       9 allocs/op
BenchmarkLockTable/groups=16,outstanding=1,read=0/-16        	  100000	     18541 ns/op	    7316 B/op	      34 allocs/op
BenchmarkLockTable/groups=16,outstanding=1,read=1/-16        	  100000	     14632 ns/op	    6349 B/op	      28 allocs/op
BenchmarkLockTable/groups=16,outstanding=1,read=2/-16        	  200000	     11921 ns/op	    5383 B/op	      22 allocs/op
BenchmarkLockTable/groups=16,outstanding=1,read=3/-16        	  200000	      8337 ns/op	    4407 B/op	      16 allocs/op
BenchmarkLockTable/groups=16,outstanding=1,read=4/-16        	 1000000	      1727 ns/op	    2506 B/op	       9 allocs/op
BenchmarkLockTable/groups=16,outstanding=1,read=5/-16        	 1000000	      1871 ns/op	    2505 B/op	       9 allocs/op
BenchmarkLockTable/groups=16,outstanding=2,read=0/-16        	   50000	     27195 ns/op	    9479 B/op	      47 allocs/op
BenchmarkLockTable/groups=16,outstanding=2,read=1/-16        	  100000	     21031 ns/op	    8442 B/op	      40 allocs/op
BenchmarkLockTable/groups=16,outstanding=2,read=2/-16        	  100000	     14650 ns/op	    6569 B/op	      28 allocs/op
BenchmarkLockTable/groups=16,outstanding=2,read=3/-16        	  200000	      9725 ns/op	    4972 B/op	      18 allocs/op
BenchmarkLockTable/groups=16,outstanding=2,read=4/-16        	 1000000	      1858 ns/op	    2506 B/op	       9 allocs/op
BenchmarkLockTable/groups=16,outstanding=2,read=5/-16        	 1000000	      1887 ns/op	    2506 B/op	       9 allocs/op
BenchmarkLockTable/groups=16,outstanding=4,read=0/-16        	   50000	     27303 ns/op	    9484 B/op	      47 allocs/op
BenchmarkLockTable/groups=16,outstanding=4,read=1/-16        	  100000	     21513 ns/op	    8502 B/op	      40 allocs/op
BenchmarkLockTable/groups=16,outstanding=4,read=2/-16        	  100000	     16280 ns/op	    7267 B/op	      31 allocs/op
BenchmarkLockTable/groups=16,outstanding=4,read=3/-16        	  100000	     12648 ns/op	    6029 B/op	      23 allocs/op
BenchmarkLockTable/groups=16,outstanding=4,read=4/-16        	 1000000	      1762 ns/op	    2506 B/op	       9 allocs/op
BenchmarkLockTable/groups=16,outstanding=4,read=5/-16        	 1000000	      1846 ns/op	    2506 B/op	       9 allocs/op
BenchmarkLockTable/groups=16,outstanding=8,read=0/-16        	   50000	     27690 ns/op	    9493 B/op	      48 allocs/op
BenchmarkLockTable/groups=16,outstanding=8,read=1/-16        	  100000	     22154 ns/op	    8582 B/op	      41 allocs/op
BenchmarkLockTable/groups=16,outstanding=8,read=2/-16        	  100000	     17375 ns/op	    7524 B/op	      32 allocs/op
BenchmarkLockTable/groups=16,outstanding=8,read=3/-16        	  100000	     14593 ns/op	    6631 B/op	      26 allocs/op
BenchmarkLockTable/groups=16,outstanding=8,read=4/-16        	 1000000	      1778 ns/op	    2505 B/op	       9 allocs/op
BenchmarkLockTable/groups=16,outstanding=8,read=5/-16        	 1000000	      1864 ns/op	    2505 B/op	       9 allocs/op
BenchmarkLockTable/groups=16,outstanding=16,read=0/-16       	   50000	     34606 ns/op	    9798 B/op	      48 allocs/op
BenchmarkLockTable/groups=16,outstanding=16,read=1/-16       	   50000	     26477 ns/op	    9048 B/op	      42 allocs/op
BenchmarkLockTable/groups=16,outstanding=16,read=2/-16       	  100000	     23607 ns/op	    7821 B/op	      32 allocs/op
BenchmarkLockTable/groups=16,outstanding=16,read=3/-16       	  100000	     20832 ns/op	    7002 B/op	      27 allocs/op
BenchmarkLockTable/groups=16,outstanding=16,read=4/-16       	 1000000	      1905 ns/op	    2505 B/op	       9 allocs/op
BenchmarkLockTable/groups=16,outstanding=16,read=5/-16       	 1000000	      2007 ns/op	    2504 B/op	       9 allocs/op

Release note: None
Collaborator Author

@sumeerbhola sumeerbhola left a comment


Do you mind posting the results to the PR and in the commit message? Ideally, you'd run with `-benchmem` and then pass the output to `benchstat`.

Done. There isn't any before number to compare with using benchstat.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @nvanbenschoten)


pkg/storage/concurrency/lock_table_test.go, line 998 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

```go
&item.Txn.TxnMeta
```

This will allocate if we don't pass benchWorkItem in as a pointer to this function.

To do this, replace:

```go
go doBenchWork(items[i], env, requestDoneCh)
```

with

```go
go doBenchWork(&items[i], env, requestDoneCh)
```

Done.
How do I get escape analysis output for the test itself? When I do `go build -gcflags "-m -m"` it does not produce anything for lock_table_test.go.


pkg/storage/concurrency/lock_table_test.go, line 1003 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Was there a reason you're acquiring latches regardless of whether there are any locks to acquire or not?

Oversight. Fixed.

Collaborator Author

@sumeerbhola sumeerbhola left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @nvanbenschoten)


pkg/storage/concurrency/lock_table_test.go, line 1112 at r2 (raw file):

```go
// RunParallel() -- it doesn't seem possible to get parallelism between these
// two values when using B.RunParallel() since B.SetParallelism() accepts an
// integer multiplier to GOMAXPROCS.
```

btw, do we have an existing test framework to do better than this?
I was first going to vary the concurrency in smaller increments: a shared variable would be incremented by the goroutine for each group, and when one of them detected that the total count had been reached it would signal all the other groups by closing a shared channel. That was more code, so I did the simpler thing here.

@sumeerbhola
Collaborator Author

bors r+

craig bot pushed a commit that referenced this pull request Feb 11, 2020
44964: storage/concurrency: benchmark for lockTable r=sumeerbhola a=sumeerbhola

Most of the allocations in lockTable are due to the temporary
*lockState created for btree lookups, and the total bytes allocated
are roughly equal to the bytes allocated by
spanlatch.allocGuardAndLatches in the contended benchmarks. CPU time
is spent mostly in runtime.mcall, within pthread_cond_wait and
pthread_cond_signal.

Release note: None

Co-authored-by: sumeerbhola <sumeer@cockroachlabs.com>
Contributor

@nvb nvb left a comment


:lgtm:

There isn't any before number to compare with using benchstat.

That's fine, it is still useful with only a single point of comparison because it aggregates across trials.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @nvanbenschoten)


pkg/storage/concurrency/lock_table_test.go, line 998 at r1 (raw file):

How do I get escape analysis output for the test itself? When I do `go build -gcflags "-m -m"` it does not produce anything for lock_table_test.go.

`go test -gcflags "-m -m"` should give you what you want, but I've never used that so I might be mistaken.

Contributor

@nvb nvb left a comment


Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @nvanbenschoten)


pkg/storage/concurrency/lock_table_test.go, line 1112 at r2 (raw file):

Previously, sumeerbhola wrote…

btw, do we have an existing test framework to do better than this?
I was first going to vary the concurrency in smaller increments: a shared variable would be incremented by the goroutine for each group, and when one of them detected that the total count had been reached it would signal all the other groups by closing a shared channel. That was more code, so I did the simpler thing here.

I'm not aware of any test framework for doing this. We generally don't use B.RunParallel, in part because of the integer limitation you pointed out here. Instead, we usually just do our own thing with WaitGroups and a manual division of work between each goroutine.

@craig
Contributor

craig bot commented Feb 11, 2020

Build succeeded

@craig craig bot merged commit 046fa89 into cockroachdb:master Feb 11, 2020
@sumeerbhola sumeerbhola deleted the ltbench branch February 12, 2020 16:05
