sql: simple bulk loading in Cockroach very slow (compared to PG) #5981
Description
It's slow to the point where you wonder how you're going to get data into it. I ran the following comparison (on my MacBook).
go install github.com/tschottdorf/goplay/tblgen
rm -rf cockroach-data && cockroach start &
docker run -p 5432:5432 postgres
time (tblgen 10000 | cockroach sql)
real 0m16.712s
user 0m35.817s
sys 0m5.158s
time (tblgen 10000 | psql -h $(docker-machine ip default) -U postgres)
real 0m0.306s
user 0m0.023s
sys 0m0.024s
There ought to be some very low-hanging fruit here. tblgen has a constant that tunes how much we batch; by default it inserts 640 entries per batch, which shouldn't be terribly bad.
The typical trace (on the slower end) looks like this:
11:35:14.461839 . 3 ... node 1
11:35:14.461847 . 8 ... read has no clock uncertainty
11:35:14.462042 . 195 ... executing 2562 requests
11:35:14.462500 . 458 ... read-write path
11:35:14.462531 . 31 ... command queue
11:35:14.475034 . 12502 ... raft
11:35:14.480098 . 5064 ... applying batch
Some traces, in which the leader lease needs to be renewed, obviously take a little longer. I ran with a (horrible) hack that simply bypasses Raft, which didn't help at all:
real 0m17.260s
user 0m36.141s
sys 0m5.156s
But that's still slow. Might be time to push #5255 to completion for easier diagnosis here.
For testing sustained throughput, probably want
./cockroach zone set .default 'range_max_bytes: 99999999999'
./cockroach zone set .default 'replicas:
- attrs: []
'
since once a range splits, these inserts are no longer 1PC txns and things get really bad (on the other hand, background queues could become problematic with a very large replica).