Skip to content

workload/bank: generate random strings much faster#35441

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
danhhz:workload_bank_fast
Mar 11, 2019
Merged

workload/bank: generate random strings much faster#35441
craig[bot] merged 1 commit intocockroachdb:masterfrom
danhhz:workload_bank_fast

Conversation

@danhhz
Copy link
Copy Markdown
Contributor

@danhhz danhhz commented Mar 6, 2019

Give bank's initial table data generation the same love that tpcc just
got in #35322. We use bank for some of the roachtests that just want a
bunch of data, so it's worth speeding up. The generated "payload" field
has changed a bit (a-zA-Z instead of hex encoded), but it wasn't that
way before for any particular reason, so this should be okay.

In summary:

  • The slowest part of generating random strings is generating the random
    number. Previously, we made a uint64 and threw away most of it to pick
    one character from an alphabet. Instead, pick as many characters as we
    can get from 64 bits: floor(log(math.MaxUint64)/log(len(alpabet)).
  • The current thinking for go2 is to switch to a Permuted Congruential
    Generator (PCG), which is much faster than the current "math/rand"
    generator. It's smaller in memory (128-bit vs 607 64-bit words) and
    much faster to seed. Use the implementation of that proposal that's
    in "golang.org/x/exp/rand"

Benchmark:

name        old time/op    new time/op      delta
InitBank-8    12.6ms ± 1%       0.6ms ± 2%    -95.54%  (p=0.008 n=5+5)

name        old speed      new speed        delta
InitBank-8  8.02MB/s ± 1%  183.44MB/s ± 2%  +2187.28%  (p=0.008 n=5+5)

name        old alloc/op   new alloc/op     delta
InitBank-8    5.88MB ± 0%      0.23MB ± 0%    -96.05%  (p=0.008 n=5+5)

name        old allocs/op  new allocs/op    delta
InitBank-8     9.00k ± 0%       6.00k ± 0%    -33.32%  (p=0.008 n=5+5)

Release note: None

@danhhz danhhz requested a review from tbg March 6, 2019 02:58
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

Copy link
Copy Markdown
Contributor

@nvb nvb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 3 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @danhhz and @tbg)


cpu.prof, line 0 at r1 (raw file):
Woops 😄


pkg/workload/bank/bank.go, line 121 at r1 (raw file):

			b.rows,
			func(rowIdx int) []interface{} {
				rng := rand.New(rand.NewSource(b.seed + uint64(rowIdx)))

If we're just using this as a rand.Source, I bet we can remove an allocation per row by changing this to:

rng := rand.NewSource(b.seed + uint64(rowIdx))

Or if we want to go crazy, pull the variable out of this function and just Seed it each time. The second suggestion is only safe if we never call this function concurrently.


pkg/workload/bank/bank.go, line 122 at r1 (raw file):

			func(rowIdx int) []interface{} {
				rng := rand.New(rand.NewSource(b.seed + uint64(rowIdx)))
				payload := make([]byte, b.payloadBytes)

Do we get any speedup from batch allocating this? Something like:

func (b *bank) Tables() []workload.Table {
    var payloadAlloc []byte
    ...
        func(rowIdx int) []interface{} {
            if len(payloadAlloc) < b.payloadBytes {
                payloadAlloc = make([]byte, 64*b.payloadBytes)
            }
            payload = payloadAlloc[:b.payloadBytes]
            payloadAlloc = payloadAlloc[b.payloadBytes:]

Or if we're confident that this function will be called b.rows, just start with payloadAlloc sized to b.rows*b.payloadBytes?

Give bank's initial table data generation the same love that tpcc just
got in cockroachdb#35322. We use bank for some of the roachtests that just want a
bunch of data, so it's worth speeding up. The generated "payload" field
has changed a bit (a-zA-Z instead of hex encoded), but it wasn't that
way before for any particular reason, so this should be okay.

In summary:
- The slowest part of generating random strings is generating the random
  number. Previously, we made a uint64 and threw away most of it to pick
  one character from an alphabet. Instead, pick as many characters as we
  can get from 64 bits: floor(log(math.MaxUint64)/log(len(alpabet)).
- The current thinking for go2 is to switch to a Permuted Congruential
  Generator (PCG), which is much faster than the current "math/rand"
  generator. It's smaller in memory (128-bit vs 607 64-bit words) and
  _much_ faster to seed. Use the implementation of that proposal that's
  in "golang.org/x/exp/rand"

Benchmark:

    name        old time/op    new time/op      delta
    InitBank-8    12.6ms ± 1%       0.6ms ± 2%    -95.54%  (p=0.008 n=5+5)

    name        old speed      new speed        delta
    InitBank-8  8.02MB/s ± 1%  183.44MB/s ± 2%  +2187.28%  (p=0.008 n=5+5)

    name        old alloc/op   new alloc/op     delta
    InitBank-8    5.88MB ± 0%      0.23MB ± 0%    -96.05%  (p=0.008 n=5+5)

    name        old allocs/op  new allocs/op    delta
    InitBank-8     9.00k ± 0%       6.00k ± 0%    -33.32%  (p=0.008 n=5+5)

Release note: None
@danhhz danhhz force-pushed the workload_bank_fast branch from ed4e814 to dc09a4c Compare March 6, 2019 16:11
@danhhz danhhz requested a review from a team March 6, 2019 16:11
Copy link
Copy Markdown
Contributor Author

@danhhz danhhz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RFAL

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @tbg)


cpu.prof, line at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Woops 😄

🙈


pkg/workload/bank/bank.go, line 121 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

If we're just using this as a rand.Source, I bet we can remove an allocation per row by changing this to:

rng := rand.NewSource(b.seed + uint64(rowIdx))

Or if we want to go crazy, pull the variable out of this function and just Seed it each time. The second suggestion is only safe if we never call this function concurrently.

Ah, great eye. The reason that method takes a rand.Source in the first place is because when I was doing #35322, I noticed that it was faster when you gave it a *rand.PCGSource instead of a *rand.Rand. But when I tried plumbing the source all the way down, I didn't see any difference in the overall InitTPCC benchmark. It does however make a difference here, both in cpu and allocs! Diff from simply removing the rand.New wrapper:

name        old time/op    new time/op    delta
InitBank-8     642µs ± 1%     561µs ± 2%  -12.68%  (p=0.008 n=5+5)

name        old speed      new speed      delta
InitBank-8   160MB/s ± 1%   183MB/s ± 2%  +14.53%  (p=0.008 n=5+5)

name        old alloc/op   new alloc/op   delta
InitBank-8     280kB ± 0%     232kB ± 0%  -17.13%  (p=0.000 n=4+5)

name        old allocs/op  new allocs/op  delta
InitBank-8     7.00k ± 0%     6.00k ± 0%  -14.28%  (p=0.008 n=5+5)


pkg/workload/bank/bank.go, line 122 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Do we get any speedup from batch allocating this? Something like:

func (b *bank) Tables() []workload.Table {
    var payloadAlloc []byte
    ...
        func(rowIdx int) []interface{} {
            if len(payloadAlloc) < b.payloadBytes {
                payloadAlloc = make([]byte, 64*b.payloadBytes)
            }
            payload = payloadAlloc[:b.payloadBytes]
            payloadAlloc = payloadAlloc[b.payloadBytes:]

Or if we're confident that this function will be called b.rows, just start with payloadAlloc sized to b.rows*b.payloadBytes?

Those functions have to be concurrency safe. TPCC gets around this by putting them in a pool using bufalloc.BytesAllocator, which is straightforward, but I'll leave for a followup.

(FWIW, BytesAllocator is probably a better idea here than allocating b.rows * b.payloadBytes, since we seem to be using this workload when we need some large sized cluster but don't particularly care what's in it, so it tends to get a combo that means a very large b.rows * b.payloadBytes.)

Copy link
Copy Markdown
Contributor

@nvb nvb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 2 of 3 files at r2.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @tbg)

@danhhz
Copy link
Copy Markdown
Contributor Author

danhhz commented Mar 11, 2019

TFTR!

bors r=nvanbenschoten

craig bot pushed a commit that referenced this pull request Mar 11, 2019
35441: workload/bank: generate random strings much faster r=nvanbenschoten a=danhhz

Give bank's initial table data generation the same love that tpcc just
got in #35322. We use bank for some of the roachtests that just want a
bunch of data, so it's worth speeding up. The generated "payload" field
has changed a bit (a-zA-Z instead of hex encoded), but it wasn't that
way before for any particular reason, so this should be okay.

In summary:
- The slowest part of generating random strings is generating the random
  number. Previously, we made a uint64 and threw away most of it to pick
  one character from an alphabet. Instead, pick as many characters as we
  can get from 64 bits: floor(log(math.MaxUint64)/log(len(alpabet)).
- The current thinking for go2 is to switch to a Permuted Congruential
  Generator (PCG), which is much faster than the current "math/rand"
  generator. It's smaller in memory (128-bit vs 607 64-bit words) and
  _much_ faster to seed. Use the implementation of that proposal that's
  in "golang.org/x/exp/rand"

Benchmark:

    name        old time/op    new time/op      delta
    InitBank-8    12.6ms ± 1%       0.6ms ± 2%    -95.54%  (p=0.008 n=5+5)

    name        old speed      new speed        delta
    InitBank-8  8.02MB/s ± 1%  183.44MB/s ± 2%  +2187.28%  (p=0.008 n=5+5)

    name        old alloc/op   new alloc/op     delta
    InitBank-8    5.88MB ± 0%      0.23MB ± 0%    -96.05%  (p=0.008 n=5+5)

    name        old allocs/op  new allocs/op    delta
    InitBank-8     9.00k ± 0%       6.00k ± 0%    -33.32%  (p=0.008 n=5+5)

Release note: None

Co-authored-by: Daniel Harrison <daniel.harrison@gmail.com>
@craig
Copy link
Copy Markdown
Contributor

craig bot commented Mar 11, 2019

Build succeeded

@craig craig bot merged commit dc09a4c into cockroachdb:master Mar 11, 2019
@danhhz danhhz deleted the workload_bank_fast branch March 11, 2019 20:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants