feat: Logging and fsync delay for retention deletion (#27114)
devanbenz merged 30 commits into master-1.x
Conversation
I suspect this area of the code is where we "hang" during retention policy deletion. It only occurs in very high cardinality dbs (~10 million+ series). `DeleteSeriesID` holds a mutex lock and issues an fsync call. Running this millions of times, with lock contention and disk I/O from various other writers/readers, could potentially go on for days. This PR adds a WARN log that triggers every 24 hours while we're looping through and deleting series.
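To make the bottleneck concrete, here is a minimal sketch of the per-delete fsync shape described above. The `seriesFile` type, its fields, and method names are illustrative stand-ins, not InfluxDB's actual API; the counter stands in for the fsync syscall.

```go
package main

import (
	"fmt"
	"sync"
)

// seriesFile is a toy stand-in for the series-file structure described
// above; the field and method names are illustrative, not InfluxDB's
// actual API.
type seriesFile struct {
	mu        sync.Mutex
	syncCalls int
}

// deleteSeriesID mimics the pre-fix shape: every call takes the lock,
// writes its tombstone, and fsyncs before returning.
func (f *seriesFile) deleteSeriesID(id uint64) {
	f.mu.Lock()
	defer f.mu.Unlock()
	_ = id        // ... write tombstone entry for id ...
	f.syncCalls++ // stands in for the per-call fsync
}

func main() {
	f := &seriesFile{}
	for id := uint64(1); id <= 1000; id++ {
		f.deleteSeriesID(id)
	}
	fmt.Println("fsyncs issued:", f.syncCalls) // one per series
}
```

With one fsync per series, deleting millions of series means millions of synchronous disk flushes, all taken under the same lock.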
davidby-influx left a comment:
I think extracting the call to Flush and making DeleteSeriesID take an iterable construct would be worth testing. The only place it is called outside of a loop is in a test, which could easily be changed to a single element array.
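The suggestion above can be sketched as follows: a batch-taking `deleteSeriesIDs` that does all the tombstone writes under one lock and flushes once at the end. Names here are hypothetical, not the real InfluxDB signatures.

```go
package main

import (
	"fmt"
	"sync"
)

// seriesFile is a toy stand-in; names are illustrative, not the real API.
type seriesFile struct {
	mu         sync.Mutex
	flushCalls int
}

func (f *seriesFile) flush() { f.flushCalls++ } // stands in for the fsync

// deleteSeriesIDs deletes the whole batch under one lock and flushes
// once at the end, instead of fsyncing inside every per-series call.
func (f *seriesFile) deleteSeriesIDs(ids []uint64) {
	f.mu.Lock()
	defer f.mu.Unlock()
	for _, id := range ids {
		_ = id // ... write tombstone entry ...
	}
	f.flush()
}

func main() {
	f := &seriesFile{}
	f.deleteSeriesIDs(make([]uint64, 10000))
	fmt.Println("flushes:", f.flushCalls) // 1 for the whole batch
}
```

The single-element-array case the comment mentions falls out naturally: a caller that previously deleted one series just passes a one-element slice.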
gwossum left a comment:
A few minor questions, but overall nice job on finding this performance bottleneck!
@devanbenz - Please create port issues for this in main-2.x and RR. Also add it to the epic issue: OSS 2.9.0 cherry-picks.
My thought was to have
I kind of think we should keep the code as is, since it's been using ForEach for many years. I'm tempted not to change the code too much and to just move the flush semantics around. I think having DeleteSeriesID as an atomic operation is fine; if I wanted to move the iterable inside the deletion I would probably implement a
davidby-influx left a comment:
There's a big opportunity for another optimization here....
Please add to the 2.9.0 epic if you haven't yet.
- pre-allocate slices
- add a constant for SeriesN in segment flush tests
- check error returns in tests
davidby-influx left a comment:
LGTM.
Nice work; this should give us a big speed-up for high-cardinality databases.
This area of the code is where we "hang" during retention policy deletion. It only occurs in very high cardinality dbs (~10 million+ series). DeleteSeriesID holds a mutex lock and does an fsync call. Running this millions of times, plus contention and disk I/O between various other writers/readers, could potentially go on for days. This PR batches sync operations instead of running a sync during every series deletion op. It also adds additional logging to retention series deletion. (cherry picked from commit c836ac2)
Notes on changes:
After running influxdb with inch running in the background, I see the following:
That's ~7.5ms per DeleteSeriesID call.
So for ~35 million series cardinality it should take approximately ~3 days to run. I would expect it to take longer in practice due to lock contention and higher resource utilization; my test also had far less cardinality.
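The ~3 day figure follows directly from the measured per-call latency; a quick back-of-the-envelope check (the helper name is mine, not from the PR):

```go
package main

import "fmt"

// estimateDays multiplies per-call latency by cardinality and converts
// milliseconds to days: a sanity check of the estimate above.
func estimateDays(series int, perCallMs float64) float64 {
	totalMs := float64(series) * perCallMs
	return totalMs / 1000 / 60 / 60 / 24
}

func main() {
	// ~7.5ms per DeleteSeriesID call times ~35M series.
	fmt.Printf("~%.1f days\n", estimateDays(35_000_000, 7.5)) // ~3.0 days
}
```

35,000,000 × 7.5ms ≈ 262,500 seconds ≈ 73 hours, i.e. roughly 3 days before accounting for contention.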
It does appear that the fsync is killing performance.
The time is now so small that I need to adjust my calculations:
~2ms for 10k series, versus ~7ms for a single series previously.
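Taking those two measurements at face value, the end-to-end difference for the ~35M-series case works out as follows (helper names are mine, for illustration):

```go
package main

import "fmt"

// oldHours: total time at ~perSeriesMs per single-series delete.
func oldHours(series int, perSeriesMs float64) float64 {
	return perSeriesMs * float64(series) / 1000 / 3600
}

// newSeconds: total time at ~perBatchMs per batch of batchSize deletes.
func newSeconds(series, batchSize int, perBatchMs float64) float64 {
	return perBatchMs * float64(series/batchSize) / 1000
}

func main() {
	// 35M series: ~7ms per series before vs ~2ms per 10k-series batch after.
	fmt.Printf("old: ~%.0f hours, new: ~%.0f seconds\n",
		oldHours(35_000_000, 7.0), newSeconds(35_000_000, 10000, 2.0))
}
```

Per these rough numbers, the fsync-bound portion drops from tens of hours to single-digit seconds; real runs would still pay for the tombstone writes, lock contention, and other I/O.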
This PR also adds a log that triggers every 10k series while looping through and deleting series, for better debugging.