storage: unexpected GC queue activity immediately after DROP #20554
Description
Experimentation notes. I'm running a single-node release-1.1 cluster with a restored
tpch.lineitem (SF 1) table. Without changing the TTL, I dropped this table last night.
The "live bytes" fell to ~zero within 30 minutes (i.e., it took 30 minutes for
all keys to be deleted, though not yet garbage collected), while on-disk usage
grew from 1.3GB to 1.7GB (which makes sense, since we wrote lots of MVCC tombstones).
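The growth-before-shrink behavior can be illustrated with a toy MVCC model (purely illustrative Python; class and method names are hypothetical and this is not CockroachDB's actual storage code): a delete writes a tombstone version rather than removing data, and only GC after the TTL reclaims space.

```python
class MVCCStore:
    """Toy MVCC keyspace: each key maps to a list of (timestamp, value) versions."""

    def __init__(self):
        self.versions = {}  # key -> list of (ts, value); value None == tombstone

    def put(self, key, value, ts):
        self.versions.setdefault(key, []).append((ts, value))

    def delete(self, key, ts):
        # A delete is just another version: an MVCC tombstone. Disk usage grows.
        self.versions.setdefault(key, []).append((ts, None))

    def gc(self, now, ttl):
        """Discard versions no longer readable at timestamps above now - ttl."""
        threshold = now - ttl
        for key in list(self.versions):
            vs = sorted(self.versions[key])
            # Newest version at or below the threshold must be kept to serve
            # reads just above the threshold; everything older is garbage.
            idx = max((i for i, (ts, _) in enumerate(vs) if ts <= threshold),
                      default=0)
            kept = vs[idx:]
            # If all that remains is a tombstone older than the TTL, the key
            # can be removed entirely.
            if len(kept) == 1 and kept[0][1] is None and kept[0][0] <= threshold:
                kept = []
            if kept:
                self.versions[key] = kept
            else:
                del self.versions[key]

s = MVCCStore()
s.put("a", "x", ts=1)
s.delete("a", ts=2)          # tombstone: now *two* versions on "disk"
assert len(s.versions["a"]) == 2
s.gc(now=3, ttl=10)          # TTL not yet elapsed: nothing reclaimed
assert len(s.versions["a"]) == 2
s.gc(now=20, ttl=10)         # tombstone older than TTL: key fully removed
assert "a" not in s.versions
```

This is why dropping the table initially costs disk space: every key gains a tombstone version, and nothing is reclaimable until the TTL has passed.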
What stuck out is that while this was going on, I saw lots of unexpected GC runs
that didn't get to delete any data. I initially thought those must have been
triggered by the "intent age", which spikes as the range deletion puts down many
intents that are only cleaned up after the transaction commits; they're likely
visible for too long, which gets the replica queued. But what speaks against this
theory is that GC ran in circles all night, apparently always triggered but never
successful at reducing the score. This strikes me as quite odd and needs more
investigation.
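The intent-age theory can be sketched as a toy scoring function (the names, thresholds, and formula here are hypothetical illustrations, not CockroachDB's actual GC queue scoring): a range can keep qualifying for GC via its intent signal even when there is no TTL-expired garbage to collect, so runs are triggered but cannot lower the score.

```python
def gc_should_queue(avg_intent_age_sec, expired_garbage_bytes,
                    intent_age_threshold_sec=3600):
    """Hypothetical two-signal GC queue check (illustrative only).

    Signal 1: MVCC garbage already older than the TTL.
    Signal 2: intents that have been visible for "too long".
    """
    garbage_score = 1.0 if expired_garbage_bytes > 0 else 0.0
    intent_score = avg_intent_age_sec / intent_age_threshold_sec
    return garbage_score >= 1.0 or intent_score >= 1.0

# Old intents alone trip the check, even with zero collectible garbage ...
assert gc_should_queue(avg_intent_age_sec=7200, expired_garbage_bytes=0)
# ... but a GC run can't collect anything, the intents stay old, and the
# replica is immediately requeued on the next scan.
assert not gc_should_queue(avg_intent_age_sec=60, expired_garbage_bytes=0)
```

Under this (assumed) model, the all-night loop is exactly what you'd expect: the trigger condition holds, but the run has no lever to change it.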
This morning, I changed the TTL to 100s and am seeing steady GC queue activity,
with each run clearing out a whole range and making steady progress. Annoyingly,
the consistency checker is also running all the time, which can't help performance.
The GC queue took around 18 minutes to clean up ~1.3GB worth of on-disk data,
which seems OK. After the run, the data directory stabilized at 200-300MB, which
drops to 8MB after an offline compaction.
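As a quick sanity check on that cleanup rate (plain arithmetic, no CockroachDB code):

```python
# ~1.3GB of on-disk data cleared by the GC queue in roughly 18 minutes.
gb = 1.3
minutes = 18
throughput_mb_per_sec = gb * 1024 / (minutes * 60)
print(f"{throughput_mb_per_sec:.2f} MB/s")  # ~1.2 MB/s
```

A bit over a megabyte per second of garbage cleared, which matches the "seems OK, not great" impression.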
RocksDB seems to be running compactions, since the data directory (at the time
of writing) has dropped to 613MB, and within another minute to 419MB (with some
jitter). Logging output is quiet and memory usage is stable, though I'm sometimes
seeing 25 Go GC runs logged in the runtime stats, which I think is higher than I'm
used to seeing (the GC queue is not allocation-efficient, so that makes some sense
to me).
Running the experiment again to look specifically into the first part.