Skip to content

feat: Defer cleanup for log/index compactions, add debug log (#26511)#26629

Merged
devanbenz merged 1 commit into1.12from
db/cherrypick_defer_log
Jul 25, 2025
Merged

feat: Defer cleanup for log/index compactions, add debug log (#26511)#26629
devanbenz merged 1 commit into1.12from
db/cherrypick_defer_log

Conversation

@devanbenz
Copy link
Copy Markdown

I believe that there is something happening which causes CurrentCompactionN() to always be greater than 0. Thus making Partition.Wait() hang forever.

Taking a look at some profiles where this issue occurs. I'm seeing a consistent one where we're stuck on Partition.Wait()

-----------+-------------------------------------------------------
         1   runtime.gopark
             runtime.chanrecv
             runtime.chanrecv1
             github.com/influxdata/influxdb/tsdb/index/tsi1.(*Partition).Wait
             github.com/influxdata/influxdb/tsdb/index/tsi1.(*Partition).Close
             github.com/influxdata/influxdb/tsdb/index/tsi1.(*Index).close
             github.com/influxdata/influxdb/tsdb/index/tsi1.(*Index).Close
             github.com/influxdata/influxdb/tsdb.(*Shard).closeNoLock
             github.com/influxdata/influxdb/tsdb.(*Shard).Close
             github.com/influxdata/influxdb/tsdb.(*Store).DeleteShard
             github.com/influxdata/influxdb/services/retention.(*Service).DeletionCheck.func3
             github.com/influxdata/influxdb/services/retention.(*Service).DeletionCheck
             github.com/influxdata/influxdb/services/retention.(*Service).run
             github.com/influxdata/influxdb/services/retention.(*Service).Open.func1
-----------+-------------------------------------------------------

Defer'ing compaction count cleanup inside goroutines should help with any hanging current compaction counts.

Modify currentCompactionN to be a sync atomic.

Adding a debug level log within Compaction.Wait() should aid in debugging.

(cherry picked from commit 149fb47)

Closes #

Describe your proposed changes here.

  • I've read the contributing section of the project README.
  • Signed CLA (if not already signed).

I believe that there is something happening which causes CurrentCompactionN() to always be greater than 0. Thus making Partition.Wait() hang forever.

Taking a look at some profiles where this issue occurs. I'm seeing a consistent one where we're stuck on Partition.Wait()
```
-----------+-------------------------------------------------------
         1   runtime.gopark
             runtime.chanrecv
             runtime.chanrecv1
             github.com/influxdata/influxdb/tsdb/index/tsi1.(*Partition).Wait
             github.com/influxdata/influxdb/tsdb/index/tsi1.(*Partition).Close
             github.com/influxdata/influxdb/tsdb/index/tsi1.(*Index).close
             github.com/influxdata/influxdb/tsdb/index/tsi1.(*Index).Close
             github.com/influxdata/influxdb/tsdb.(*Shard).closeNoLock
             github.com/influxdata/influxdb/tsdb.(*Shard).Close
             github.com/influxdata/influxdb/tsdb.(*Store).DeleteShard
             github.com/influxdata/influxdb/services/retention.(*Service).DeletionCheck.func3
             github.com/influxdata/influxdb/services/retention.(*Service).DeletionCheck
             github.com/influxdata/influxdb/services/retention.(*Service).run
             github.com/influxdata/influxdb/services/retention.(*Service).Open.func1
-----------+-------------------------------------------------------
```

Defer'ing compaction count cleanup inside goroutines should help with any hanging current compaction counts.

Modify currentCompactionN to be a sync atomic.

Adding a debug level log within Compaction.Wait() should aid in debugging.

(cherry picked from commit 149fb47)
Copy link
Copy Markdown
Contributor

@davidby-influx davidby-influx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@devanbenz devanbenz merged commit 6f526cd into 1.12 Jul 25, 2025
9 checks passed
@devanbenz devanbenz deleted the db/cherrypick_defer_log branch July 25, 2025 18:21
devanbenz added a commit that referenced this pull request Jul 31, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants