ccl/sqlproxyccl: add benchmarks for connection copying logic #76429
ccl/sqlproxyccl: add benchmarks for connection copying logic #76429jaylim-crl wants to merge 2 commits intocockroachdb:masterfrom
Conversation
Informs cockroachdb#76000. This commit implements postgres interceptors, namely FrontendInterceptor and BackendInterceptor, as described in the sqlproxy connection migration RFC. These interceptors will be used as building blocks for the forwarder component that we will be adding in a later PR. Since the forwarder component has not been added, a simple proxy test (i.e. TestSimpleProxy) has been added to illustrate how the frontend and backend interceptors can be used within the proxy. Release note: None
Informs cockroachdb#76000. Follow up to cockroachdb#76006. This commit adds a benchmark test for the connection copying logic. ``` go version go1.17.3 linux/amd64 --- goos: linux goarch: amd64 pkg: github.com/cockroachdb/cockroach/pkg/ccl/sqlproxyccl/interceptor cpu: AMD Ryzen 9 5950X 16-Core Processor BenchmarkConnectionCopy/msgCount=10000/msgLen=170/io.Copy-32 385 3538410 ns/op BenchmarkConnectionCopy/msgCount=10000/msgLen=170/io.CopyN-32 15 75036404 ns/op BenchmarkConnectionCopy/msgCount=10000/msgLen=170/pgproto3-32 26 42251150 ns/op BenchmarkConnectionCopy/msgCount=10000/msgLen=170/chunkreader-32 18 65000047 ns/op BenchmarkConnectionCopy/msgCount=10000/msgLen=170/interceptor-32 33 39125341 ns/op BenchmarkConnectionCopy/msgCount=500/msgLen=320/io.Copy-32 3204 406888 ns/op BenchmarkConnectionCopy/msgCount=500/msgLen=320/io.CopyN-32 410 3123298 ns/op BenchmarkConnectionCopy/msgCount=500/msgLen=320/pgproto3-32 577 2387535 ns/op BenchmarkConnectionCopy/msgCount=500/msgLen=320/chunkreader-32 469 3565173 ns/op BenchmarkConnectionCopy/msgCount=500/msgLen=320/interceptor-32 652 2015079 ns/op BenchmarkConnectionCopy/msgCount=9000/msgLen=2900/io.Copy-32 49 23330567 ns/op BenchmarkConnectionCopy/msgCount=9000/msgLen=2900/io.CopyN-32 18 72003323 ns/op BenchmarkConnectionCopy/msgCount=9000/msgLen=2900/pgproto3-32 15 82500818 ns/op BenchmarkConnectionCopy/msgCount=9000/msgLen=2900/chunkreader-32 18 79832494 ns/op BenchmarkConnectionCopy/msgCount=9000/msgLen=2900/interceptor-32 20 54727023 ns/op BenchmarkConnectionCopy/msgCount=5000/msgLen=30000/io.Copy-32 12 98640876 ns/op BenchmarkConnectionCopy/msgCount=5000/msgLen=30000/io.CopyN-32 10 110690053 ns/op BenchmarkConnectionCopy/msgCount=5000/msgLen=30000/pgproto3-32 6 177894915 ns/op BenchmarkConnectionCopy/msgCount=5000/msgLen=30000/chunkreader-32 9 129588686 ns/op BenchmarkConnectionCopy/msgCount=5000/msgLen=30000/interceptor-32 9 112610362 ns/op BenchmarkConnectionCopy/msgCount=10000/msgLen=10/io.Copy-32 591 2274817 ns/op BenchmarkConnectionCopy/msgCount=10000/msgLen=10/io.CopyN-32 25 47465099 ns/op BenchmarkConnectionCopy/msgCount=10000/msgLen=10/pgproto3-32 58 23077900 ns/op BenchmarkConnectionCopy/msgCount=10000/msgLen=10/chunkreader-32 38 31459201 ns/op BenchmarkConnectionCopy/msgCount=10000/msgLen=10/interceptor-32 64 17616468 ns/op BenchmarkConnectionCopy/msgCount=10000/msgLen=15/io.Copy-32 531 2336896 ns/op BenchmarkConnectionCopy/msgCount=10000/msgLen=15/io.CopyN-32 30 45135200 ns/op BenchmarkConnectionCopy/msgCount=10000/msgLen=15/pgproto3-32 51 22100293 ns/op BenchmarkConnectionCopy/msgCount=10000/msgLen=15/chunkreader-32 49 28931167 ns/op BenchmarkConnectionCopy/msgCount=10000/msgLen=15/interceptor-32 66 15189020 ns/op ``` Release note: None
jaylim-crl
left a comment
There was a problem hiding this comment.
As a reminder, only the second commit includes the benchmark tests. Note that this is an end-to-end benchmark (i.e. with dummy proxy and SQL servers, instead of in-memory ones). If we don't do that, we can't take into account the Go TCP optimization that was in place for io.Copy and friends. The original benchmark did everything in-memory (without sockets), and it wasn't a full representative of workloads in general. Since this is an end-to-end benchmark, it would make sense to compare the approaches with the io.Copy approach as a baseline.
The message counts and lengths were obtained by first measuring the average pgwire message sizes for various workloads over a period of 30s. See https://docs.google.com/spreadsheets/d/1ui7Rkw9zIsEctOIFT8GFoO9aHpvBCJbBlsxg6JN_CmE/edit?usp=sharing for more information.
Reviewable status:
complete! 0 of 0 LGTMs obtained
pkg/ccl/sqlproxyccl/interceptor/bench_test.go, line 223 at r2 (raw file):
// readers are blocked waiting for packets, and we didn't want to make the // connection a context-aware one. os.Exit(1)
I think this is reasonable since this is just a benchmark test unless there are strong opinions.
Code quote:
// Terminate process. We have to do a force kill here because all the
// readers are blocked waiting for packets, and we didn't want to make the
// connection a context-aware one.
os.Exit(1)
jaylim-crl
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @jeffswenson and @rafiss)
pkg/ccl/sqlproxyccl/interceptor/bench_test.go, line 223 at r2 (raw file):
Previously, jaylim-crl (Jay Lim) wrote…
I think this is reasonable since this is just a benchmark test unless there are strong opinions.
Hm, looks like CI is unhappy with this. I'll rework this.
rafiss
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @jaylim-crl and @jeffswenson)
pkg/ccl/sqlproxyccl/interceptor/bench_test.go, line 89 at r2 (raw file):
runSql := func() (net.Listener, error) { ln, err := net.Listen("tcp", "0.0.0.0:0")
nit: should this listen in 127.0.0.1?
pkg/ccl/sqlproxyccl/interceptor/bench_test.go, line 95 at r2 (raw file):
err = stopper.RunAsyncTask(ctx, "sql-quiesce", func(ctx context.Context) { <-stopper.ShouldQuiesce() if err := ln.Close(); err != nil && !grpcutil.IsClosedConnection(err) {
nit: should it use netutil.IsClosedConnection instead?
pkg/ccl/sqlproxyccl/interceptor/bench_test.go, line 115 at r2 (raw file):
connCopyClientFn, connCopyServerFn connCopyType, ) (net.Listener, error) { ln, err := net.Listen("tcp", "0.0.0.0:0")
nit: should this listen in 127.0.0.1?
pkg/ccl/sqlproxyccl/interceptor/bench_test.go, line 121 at r2 (raw file):
err = stopper.RunAsyncTask(ctx, "proxy-quiesce", func(ctx context.Context) { <-stopper.ShouldQuiesce() if err := ln.Close(); err != nil && !grpcutil.IsClosedConnection(err) {
nit: should it use netutil.IsClosedConnection instead?
pkg/ccl/sqlproxyccl/interceptor/bench_test.go, line 223 at r2 (raw file):
Previously, jaylim-crl (Jay Lim) wrote…
Hm, looks like CI is unhappy with this. I'll rework this.
I had a similar problem in the test I wrote here:
cockroach/pkg/sql/pgwire/pgwirecancel/cancel_test.go
Lines 125 to 126 in 657c0e1
that's why I added the extra Close call
jaylim-crl
left a comment
There was a problem hiding this comment.
Note that this is still on my mind. I originally wrote this benchmark without the intention of merging back then (and hence the nits). I'll come back to this later once we have the forwarder in.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @jaylim-crl and @jeffswenson)
|
I'm going to close this. We've been running the new sqlproxy for many months, and I don't think it's worth to continue benchmarking with the old approach, which wouldn't work with connection migration at all. |
Only review the second commit. The first commit is in the other PR (#76006).
Informs #76000. Follow up to #76006.
This commit adds a benchmark test for the connection copying logic.
Release note: None