copy: figure out how to reduce kv costs further #90743
Open
Labels: C-investigation (Further steps needed to qualify. C-label will change.), T-storage (Storage Team)
Description
COPY is currently mostly bound by the CPU time required to execute kv batch requests. We want to figure out how to reduce these costs so that ideas about optimizing the SQL layer (removing datums etc.) start to make more sense. Ideas to explore:
- use columnar kv batches instead of row-oriented ones; @nvanbenschoten says this could "reduce the number of ranges that a given batch touches, so each range would handle larger batches"
- play with bigger batch sizes and kv.transaction.write_pipelining_max_batch_size; why does throughput max out at 100 rows?
- is load-based splitting kicking in? if not, could we make it kick in sooner?
- can we make a case that multiple concurrent requests to the same range should be possible? possibly only makes sense for reads.
See image for current COPY profile:
- can we do anything about the kvclient-side costs, which also seem to dominate string parsing and kv encoding (see image)?
- e.g. can we do something about the sorting? i.e. could we split the per-index sorting costs across multiple goroutines? kv encoding costs too?
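To make the last idea concrete, here is a minimal Go sketch of splitting per-index sorting across goroutines. It is not taken from the CockroachDB code base: the `kv` type and the `sortPerIndex` helper are hypothetical stand-ins, and it assumes each index's encoded KVs already sit in their own slice so the sorts share no state.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
	"sync"
)

// kv is a hypothetical stand-in for an encoded key/value pair.
type kv struct {
	Key, Value []byte
}

// sortPerIndex sorts each index's KVs by key, using one goroutine per index.
// The slices are disjoint, so no locking is needed beyond the WaitGroup.
func sortPerIndex(perIndex [][]kv) {
	var wg sync.WaitGroup
	for i := range perIndex {
		wg.Add(1)
		go func(kvs []kv) {
			defer wg.Done()
			sort.Slice(kvs, func(a, b int) bool {
				return bytes.Compare(kvs[a].Key, kvs[b].Key) < 0
			})
		}(perIndex[i])
	}
	wg.Wait()
}

func main() {
	// Two hypothetical indexes, each with unsorted keys.
	batch := [][]kv{
		{{Key: []byte("b")}, {Key: []byte("a")}},
		{{Key: []byte("z")}, {Key: []byte("y")}, {Key: []byte("x")}},
	}
	sortPerIndex(batch)
	for i, kvs := range batch {
		for _, e := range kvs {
			fmt.Printf("index %d: %s\n", i, e.Key)
		}
	}
}
```

Whether this helps in practice would depend on how many indexes the table has and how large each batch is; with a single index or small batches, the goroutine overhead may outweigh the gain.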
Jira issue: CRDB-20919
Epic CRDB-18892
Project status: Backlog

