-
Notifications
You must be signed in to change notification settings - Fork 4.1k
sql: leak in root SQL bytes monitor/account #95324
Copy link
Copy link
Closed
Labels
C-bugCode not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.S-2Medium-high impact: many users impacted, risks of availability and difficult-to-fix data errorsMedium-high impact: many users impacted, risks of availability and difficult-to-fix data errorsT-sql-queriesSQL Queries TeamSQL Queries Team
Description
Describe the problem
Was running a write-heavy workload against a cluster, and eventually it ran into "memory budget exceeded" errors coming out of our memory monitoring infrastructure. Eventually the throughput completely collapses, so marking it as an S-2.
To Reproduce
Using 76e3ea8:
roachprod put $CLUSTER:10 $CRDB/artifacts/cockroach ./cockroach
roachprod run $CLUSTER:10 -- ./cockroach workload init kv --splits 5000 (roachprod pgurl $CLUSTER:1)
roachprod run $CLUSTER:10 -- ./cockroach workload run kv --min-block-bytes=262144 --max-block-bytes=262144 \
--concurrency=50 --splits=5000 --read-percent=0 \
--max-rate=150 --ramp 1m --duration 1h \
--tolerate-errors (roachprod pgurl $CLUSTER:1-9)
Where after a while, the workload generator starts tripping up with the following:
843.0s 0 148.9 150.0 10.5 12.6 14.2 15.7 write
844.0s 0 151.0 150.0 11.0 14.2 18.9 21.0 write
845.0s 0 150.0 150.0 10.5 14.2 15.2 21.0 write
846.0s 0 150.1 150.0 10.5 13.6 17.8 18.9 write
847.0s 0 150.0 150.0 10.5 13.6 15.7 15.7 write
848.0s 0 149.9 150.0 10.5 14.2 16.8 19.9 write
849.0s 0 150.1 150.0 11.0 14.7 67.1 142.6 write
850.0s 0 150.1 150.0 11.0 13.6 17.8 28.3 write
851.0s 0 150.0 150.0 11.0 14.2 16.3 22.0 write
852.0s 0 150.0 150.0 11.5 15.2 16.8 18.9 write
853.0s 0 149.9 150.0 11.0 14.7 18.9 21.0 write
854.0s 0 150.0 150.0 10.5 13.1 19.9 19.9 write
855.0s 0 150.0 150.0 10.5 14.2 46.1 62.9 write
I230116 20:06:08.350421 258 workload/pgx_helpers.go:79 [-] 3 pgx logger [error]: Exec logParams=map[args:[-3254532635867646659 4c1b9ff64253350810d57807dd2c134f16ca3ffb73861498c871c2436ae912b5353f3fc6dc06a8aef7d8681b27d87f35ed68a1f1caefa80f082c1bb73ca9eed9 (truncated 262080 bytes)] err:ERROR: root: memory budget exceeded: 245760 bytes requested, 7997399040 currently allocated, 7997484032 bytes in budget (SQLSTATE 53200) pid:2689528 sql:kv-2 time:4.360028ms]
E230116 20:06:08.350626 1 workload/cli/run.go:546 [-] 4 ERROR: root: memory budget exceeded: 245760 bytes requested, 7997399040 currently allocated, 7997484032 bytes in budget (SQLSTATE 53200)
I230116 20:06:08.396573 270 workload/pgx_helpers.go:79 [-] 5 pgx logger [error]: Exec logParams=map[args:[3139555668903684332 aadd31241468115cbc8b7a33050c2c992b97ccfc046a1f4997eccaff5db827c0ff630bab707f44c7ec441fdb9c758682da0cee83195e54e91217b4964df82bee (truncated 262080 bytes)] err:ERROR: root: memory budget exceeded: 245760 bytes requested, 7997327360 currently allocated, 7997484032 bytes in budget (SQLSTATE 53200) pid:9659884 sql:kv-2 time:3.732324ms]
I230116 20:06:08.409964 257 workload/pgx_helpers.go:79 [-] 6 pgx logger [error]: Exec logParams=map[args:[-4919977530161583301 3d274121bc7a59bec1da3d9a11270ecf60a40f2d6542d1cd6c624bf35c50ab2439df3e4a4fd0fd3d9d2b93b674e530a53f3eed08e45f2d6585f98a10fb3fbb92 (truncated 262080 bytes)] err:ERROR: root: memory budget exceeded: 245760 bytes requested, 7997399040 currently allocated, 7997484032 bytes in budget (SQLSTATE 53200) pid:2689528 sql:kv-2 time:3.907051ms]
I230116 20:06:08.459162 271 workload/pgx_helpers.go:79 [-] 7 pgx logger [error]: Exec logParams=map[args:[8383264568401713769 ceb2976f5150b90dd0ecab40bb4d45a36bcc76e1a0fa06ef48463dc4a816c3357825d6349d79521230c149d8e7e6a704f57963346d9b55ff6b642eae73035216 (truncated 262080 bytes)] err:ERROR: root: memory budget exceeded: 245760 bytes requested, 7997327360 currently allocated, 7997484032 bytes in budget (SQLSTATE 53200) pid:9659884 sql:kv-2 time:6.312705ms]
I230116 20:06:08.470333 242 workload/pgx_helpers.go:79 [-] 8 pgx logger [error]: Exec logParams=map[args:[-1869484712175301760 11ec9ff410bcab8fb72b5ab3e680329b7804e4c62f0a7cb9d3a046f36b659553c2d93be44a57377111b0083538b44b3cc2897fa85eccc2287b051f41745af16b (truncated 262080 bytes)] err:ERROR: root: memory budget exceeded: 245760 bytes requested, 7997399040 currently allocated, 7997484032 bytes in budget (SQLSTATE 53200) pid:2689528 sql:kv-2 time:3.82225ms]
856.0s 5 150.1 150.0 10.5 13.1 19.9 21.0 write
I230116 20:06:08.516808 272 workload/pgx_helpers.go:79 [-] 9 pgx logger [error]: Exec logParams=map[args:[-1239462385144657077 967ec05c376c0ab6cb0ad883fa9b55de86d3f6ec44fc54201d2811dc8fd2bf0979b05ba3a7f9048391e976d957a8bcdf835d75857880022afd80d3113096bd53 (truncated 262080 bytes)] err:ERROR: root: memory budget exceeded: 245760 bytes requested, 7997327360 currently allocated, 7997484032 bytes in budget (SQLSTATE 53200) pid:9659884 sql:kv-2 time:3.977102ms]
I230116 20:06:08.531356 243 workload/pgx_helpers.go:79 [-] 10 pgx logger [error]: Exec logParams=map[args:[4977677096930467774 439f0d2bfdb948ff4bae63c89c41e8914f583341b7dcb8d98fd997114b2437cb0671b58fbc1d7737e60133e216bb8a970203234f40405d8b548bca2ba0558b8b (truncated 262080 bytes)] err:ERROR: root: memory budget exceeded: 245760 bytes requested, 7997399040 currently allocated, 7997484032 bytes in budget (SQLSTATE 53200) pid:2689528 sql:kv-2 time:5.16205ms]
I230116 20:06:08.577092 256 workload/pgx_helpers.go:79 [-] 11 pgx logger [error]: Exec logParams=map[args:[-5650848376985580255 d31b31496c032afa7f691d56c50c6088748835534c5b1e8eff2efa0c7e68714a585d721cfeacb0ff4edc37b2df4e73f1ca9d705228eb7497b5155abdcf0f837f (truncated 262080 bytes)] err:ERROR: root: memory budget exceeded: 245760 bytes requested, 7997327360 currently allocated, 7997484032 bytes in budget (SQLSTATE 53200) pid:9659884 sql:kv-2 time:3.670825ms]
I230116 20:06:08.590932 95 workload/pgx_helpers.go:79 [-] 12 pgx logger [error]: Exec logParams=map[args:[-4134403631077663780 dd90037c1ca751308cb73ba893be96d9ae4298304eb545cb48537ca9998c251a3eb6ed3099a7bd91caaecd1efaecf87d2acdd29d4e99d5c40c0c9f0fdd9651b7 (truncated 262080 bytes)] err:ERROR: root: memory budget exceeded: 245760 bytes requested, 7997399040 currently allocated, 7997484032 bytes in budget (SQLSTATE 53200) pid:2689528 sql:kv-2 time:4.447271ms]
The metrics suggest that we're never free-ing up the memory budget acquired. Do we have a leak? The red annotations, in order:
- Workload started.
- Memory usage stopped growing, also where we start running into errors. Throughput starts collapsing but bytes in stays steady.
- Workload stopped. Bytes in will go to zero.
Jira issue: CRDB-23459
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
C-bugCode not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.S-2Medium-high impact: many users impacted, risks of availability and difficult-to-fix data errorsMedium-high impact: many users impacted, risks of availability and difficult-to-fix data errorsT-sql-queriesSQL Queries TeamSQL Queries Team
Type
Projects
Status
Done
