copy: figure out how to reduce kv costs further

COPY is currently mostly bound by the CPU time required to execute kv batch requests. We want to figure out how to reduce these costs so that ideas about optimizing the SQL layer (removing datums, etc.) start to make more sense. Ideas to explore:

- use columnar kv batches instead of row-oriented ones; @nvanbenschoten says this could "reduce the number of ranges that a given batch touches, so each range would handle larger batches"
- play with bigger batch sizes and `kv.transaction.write_pipelining_max_batch_size`; why does throughput max out at 100 rows?
- is load-based splitting kicking in? if not, could we make it kick in sooner?
- can we make a case that multiple concurrent requests to the same range should be possible? this possibly only makes sense for reads.
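
To make the "number of ranges that a given batch touches" idea concrete, here is a minimal sketch. It is not CockroachDB code: `rangesTouched`, the string keys, and the split points are all hypothetical, and it assumes one possible reading of "columnar", namely that a batch carries the keys of a single index rather than every index's keys for a few rows.

```go
package main

import (
	"fmt"
	"sort"
)

// rangesTouched counts how many distinct ranges a batch of keys lands in.
// splits holds the sorted range boundary keys (hypothetical); a key belongs
// to range i, where i is the number of split keys that are <= the key.
func rangesTouched(splits []string, batch []string) int {
	seen := make(map[int]struct{})
	for _, k := range batch {
		// Index of the first split key strictly greater than k.
		idx := sort.Search(len(splits), func(i int) bool { return splits[i] > k })
		seen[idx] = struct{}{}
	}
	return len(seen)
}

func main() {
	// Hypothetical layout: index 1 is the primary index, index 2 a secondary.
	splits := []string{"/t/1/m", "/t/2", "/t/2/m"}

	// Row-oriented batch: each row contributes one key per index.
	rowBatch := []string{"/t/1/a", "/t/2/x", "/t/1/b", "/t/2/y"}

	// Per-index batch of the same size: only primary-index keys.
	perIndexBatch := []string{"/t/1/a", "/t/1/b", "/t/1/c", "/t/1/d"}

	fmt.Println("row batch ranges:", rangesTouched(splits, rowBatch))
	fmt.Println("per-index batch ranges:", rangesTouched(splits, perIndexBatch))
}
```

Under these assumptions the per-index batch lands in fewer ranges than the row-oriented batch of the same size, so each range it does touch receives more keys per request.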

See image for current COPY profile:

<img width="1698" alt="Screen Shot 2022-10-26 at 8 07 08 PM" src="https://user-images.githubusercontent.com/80277990/198161277-66e0a04c-2f77-432a-8027-16f2e5071db4.png">

- can we do anything about the kvclient-side costs, which also seem to dominate string parsing and kv encoding (see image)?
- for example, can we do something about the sorting? could we split the per-index sorting costs across multiple goroutines? the kv encoding costs too?

<img width="1698" alt="Screen Shot 2022-10-26 at 8 03 51 PM" src="https://user-images.githubusercontent.com/80277990/198161072-9fd59462-013f-43ae-b5a5-6c0c270608af.png">


Jira issue: CRDB-20919

Epic CRDB-18892

copy: figure out how to reduce kv costs further #90743

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

copy: figure out how to reduce kv costs further #90743

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions