OOM occurred while running replica GC old version #42531
Description
Describe the problem
A CockroachDB data node crashes with an OOM while a replica is running GC.
To Reproduce
What did you do? Describe in your own words.
I loaded data into a table with an auto-increment column (batch inserts of a large number of rows). 25 hours later, the replica ran GC on old versions for the table, but there were too many versions, and when running GC we copy all versions into memory. I know auto-increment is not recommended in CockroachDB, but I think this is a bug.
I also found another problem: when a key has too many versions to GC (more than 256KB), normal read and write latency on that key increases because of the latch.
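To illustrate the memory concern, here is a minimal sketch in Go of processing old versions in size-bounded batches rather than holding every version in memory at once. It is not CockroachDB's actual GC code: the `version` type, the `batchVersions` helper, and the byte limit are all hypothetical names chosen for this example.

```go
package main

import "fmt"

// version is a hypothetical stand-in for one old MVCC version of a key.
type version struct {
	timestamp int64 // assumed version timestamp
	size      int   // assumed encoded size in bytes
}

// batchVersions splits versions into consecutive batches whose total size
// stays at or below maxBytes. A single version larger than maxBytes is
// placed in a batch of its own.
func batchVersions(versions []version, maxBytes int) [][]version {
	var batches [][]version
	var cur []version
	curBytes := 0
	for _, v := range versions {
		// Close the current batch before it would exceed the limit.
		if len(cur) > 0 && curBytes+v.size > maxBytes {
			batches = append(batches, cur)
			cur, curBytes = nil, 0
		}
		cur = append(cur, v)
		curBytes += v.size
	}
	if len(cur) > 0 {
		batches = append(batches, cur)
	}
	return batches
}

func main() {
	old := []version{{1, 100}, {2, 100}, {3, 100}, {4, 50}}
	for i, b := range batchVersions(old, 200) {
		fmt.Printf("batch %d: %d versions\n", i, len(b))
	}
}
```

With bounded batches, peak memory would depend on the batch limit rather than on the total number of versions, and each batch could in principle hold the latch for a shorter time.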
If possible, provide steps to reproduce the behavior:
- Set up CockroachDB cluster ...
- Send SQL ... / CLI command ...
- Look at UI / log file / client app ...
- See error
Expected behavior
A clear and concise description of what you expected to happen.
Additional data / screenshots
If the problem is SQL-related, include a copy of the SQL query and the schema
of the supporting tables.
If a node in your cluster encountered a fatal error, supply the contents of the
log directories (at minimum of the affected node(s), but preferably all nodes).
Note that log files can contain confidential information. Please continue
creating this issue, but contact support@cockroachlabs.com to submit the log
files in private.
If applicable, add screenshots to help explain your problem.
Environment:
- CockroachDB version [e.g. 2.0.x]
- Server OS: [e.g. Linux/Distrib]
- Client app [e.g. cockroach sql, JDBC, ...]
Additional context
What was the impact?
Add any other context about the problem here.