Skip to content

fix: track max sequence globally in compact() to prevent level regression#27231

Merged
davidby-influx merged 3 commits intomaster-1.xfrom
DSB/max_seq_fix
Feb 19, 2026
Merged

fix: track max sequence globally in compact() to prevent level regression#27231
davidby-influx merged 3 commits intomaster-1.xfrom
DSB/max_seq_fix

Conversation

@davidby-influx
Copy link
Copy Markdown
Contributor

@davidby-influx davidby-influx commented Feb 18, 2026

compact() derives the output filename from maxGeneration and
maxSequence+1, but maxSequence was scoped to the max generation only.
When cold or forced compaction groups L1 files (high gen, seq=1) with L4 files
(low gen, seq>=4), the output gets seq=2, regressing L4 data to L2. If large
L4 files are skipped by the cold planner's size check, the regressed file
becomes stranded, requiring repeated cold compaction cycles (12+ hours) to
recover.

The fix tracks the minimum sequence per generation to determine each
generation's true level, then bumps maxSequence so the output level never
regresses. Per-generation max sequence is preserved for collision avoidance,
and min-sequence level detection avoids inflating the output when a generation
has split files from the 2GB size limit.

Test plan

  • TestCompactor_CompactFull_MixedLevelNoRegression: L4+L1 compaction
    produces L4 output, not L2
  • TestCompactor_CompactFull_SplitFilesNoLevelInflation: split L2
    generation compacted with normal L2 produces L3, not inflated to L4

Fixes #27230

…sion

The compaction planner can produce groups mixing L1 and L4
files via cold compaction or the nested-file handling
added in d1d68d6. compact() computed the output sequence
scoped to the max generation only, so when a newer L1 file
(high gen, seq=1) was compacted with older L4 files
(low gen, seq≥4), the output was written at level 2,
regressing already-optimized data.

Track maxSequence across all input generations so the output
level is always at least as high as the highest input level.
This is safe for all existing compaction paths: PlanLevel
inputs are same-level (global max == per-generation max),
and cold/forced compactions merge into a single generation
where the higher sequence is harmless.

Fixes #27230
@davidby-influx
Copy link
Copy Markdown
Contributor Author

Claude is rethinking its life choices... Stand by.

compact() derives the output filename from maxGeneration and
maxSequence+1, but maxSequence was scoped to the max generation only.
When cold or forced compaction groups L1 files (high gen, seq=1) with
L4 files (low gen, seq>=4), the output gets seq=2 — regressing L4
data to L2.

Track the minimum sequence per generation to determine each
generation's true level, then bump maxSequence so the output level
never regresses below the highest input level. Per-generation max
sequence is preserved for collision avoidance, and min-sequence level
detection avoids inflating the output when a generation has split
files from the 2GB size limit.

Fixes #27230
@davidby-influx davidby-influx marked this pull request as ready for review February 19, 2026 01:43
@davidby-influx
Copy link
Copy Markdown
Contributor Author

davidby-influx commented Feb 19, 2026

Claude says:

 Here are my re-verified conclusions:                                                                                                                                                  
                                                                                                                                                                                        
  1. Which paths produce mixed-level groups?                                                                                                                                            
                                                                                                                                                                                        
  Only the cold/forced Plan() path (line 493). It collects all non-in-use, non-skipped files regardless of level.                                                                       
                                                                                                                                                                                        
  - PlanLevel(1,2,3): Groups by level. The orphan absorption in groupAdjacentGenerations can mix levels within a group, but PlanLevel filters groups by level() (which returns the max),
   so orphaned lower-level files get absorbed into higher-level groups that PlanLevel at the lower level never sees. PlanLevel always produces same-level groups. No mixed levels.
  - Hot Plan(): The reverse scan finds the last L4 by generation number. New L1 files always have higher generation numbers than L4 files, so they're excluded. No mixed levels in
  normal operation.
  - PlanOptimize(): Operates on cold shards after Plan(). Not relevant.

  2. The bug predates both commits. The per-generation maxSequence logic in compact() has been the same since before f0d5c73 and d1d68d6. The cold/forced path has always been able to
  mix levels. Neither commit introduced or extended the bug.

  3. The fix is correct. I traced through seven scenarios:

  ┌────────────────────┬────────────────────────────┬────────────┬────────────┬──────────────┐
  │      Scenario      │           Input            │ Fix output │ Old output │   Correct?   │
  ├────────────────────┼────────────────────────────┼────────────┼────────────┼──────────────┤
  │ L4+L1 cold         │ gen=1(s4), gen=4(s1)       │ seq=4 (L4) │ seq=2 (L2) │ Fix ✓, Old ✗ │
  ├────────────────────┼────────────────────────────┼────────────┼────────────┼──────────────┤
  │ All L4 cold        │ gen=1(s4), gen=4(s4)       │ seq=5 (L4) │ seq=5 (L4) │ Both ✓       │
  ├────────────────────┼────────────────────────────┼────────────┼────────────┼──────────────┤
  │ All L1 PlanLevel   │ gen=1..8(s1)               │ seq=2 (L2) │ seq=2 (L2) │ Both ✓       │
  ├────────────────────┼────────────────────────────┼────────────┼────────────┼──────────────┤
  │ L2 split PlanLevel │ gen=1(s2,s3), gen=16(s2)   │ seq=3 (L3) │ seq=3 (L3) │ Both ✓       │
  ├────────────────────┼────────────────────────────┼────────────┼────────────┼──────────────┤
  │ L3+L1 cold         │ gen=1(s3), gen=2(s1)       │ seq=3 (L3) │ seq=2 (L2) │ Fix ✓, Old ✗ │
  ├────────────────────┼────────────────────────────┼────────────┼────────────┼──────────────┤
  │ L4 split+L1 cold   │ gen=1(s4,s5,s6), gen=4(s1) │ seq=4 (L4) │ seq=2 (L2) │ Fix ✓, Old ✗ │
  ├────────────────────┼────────────────────────────┼────────────┼────────────┼──────────────┤
  │ L2 split+L1 cold   │ gen=1(s2,s3,s4), gen=4(s1) │ seq=2 (L2) │ seq=2 (L2) │ Both ✓       │
  └────────────────────┴────────────────────────────┴────────────┴────────────┴──────────────┘

  The level bump loop's result is order-independent (map iteration) because it only increases maxSequence, never decreases it.

  4. The tests are correct.
  - MixedLevelNoRegression: catches the bug (old code produces seq=2, fix produces seq=4)
  - SplitFilesNoLevelInflation: catches over-correction (global-max fix would produce seq=4, correct fix produces seq=3, old code also produces seq=3)

  5. One prior claim was wrong. I incorrectly attributed hot-path exposure to d1d68d6. The hot path does not produce mixed-level groups. The bug is exclusively in the cold/forced
  compaction path and has been latent since the original implementation.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a critical bug in TSM file compaction where level 4 (L4) files could be regressed to level 2 (L2) during cold or forced compaction operations. The issue occurred when compact() determined the output filename using only the maximum sequence number within the maximum generation, ignoring higher-level files from older generations. This could cause already-optimized L4 data to be written at a lower level, potentially stranding files and requiring multiple 12+ hour compaction cycles to recover.

Changes:

  • Modified compact() to track minimum sequence per generation and ensure output level never regresses below the highest input level
  • Added comprehensive tests covering mixed-level compaction and split-file edge cases

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
tsdb/engine/tsm1/compact.go Added minSeqByGen map to track minimum sequence per generation and logic to bump maxSequence to prevent level regression
tsdb/engine/tsm1/compact_test.go Added mustWriteTSMWithSeq helper and two tests: one for mixed-level regression prevention, one for split-file level inflation prevention

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +1014 to +1021
for _, minSeq := range minSeqByGen {
level := minSeq
if level > 4 {
level = 4
}
if maxSequence+1 < level {
maxSequence = level - 1
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find the logic here somewhat confusing. I've created a branch with an alternative implementation that gets the same result, but makes it clear what is being calculated.

For your consideration:
https://github.com/influxdata/influxdb/tree/gw/max_seq_fix_minInputLevel

Diff:
https://github.com/influxdata/influxdb/compare/DSB/max_seq_fix..gw/max_seq_fix_minInputLevel

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude bows to your superiority:

The gw branch computes maxInputLevel as a single value and applies the bump in one comparison. My branch iterates the map and bumps incrementally inside the loop. Let me verify      
  they're equivalent.
                                                                                                                                                                                        
  gw branch:                                                                                                                                                                          
  var maxInputLevel int                                                                                                                                                                 
  for _, minSeq := range minSeqByGen {                                                                                                                                                  
      maxInputLevel = max(maxInputLevel, min(minSeq, 4))
  }
  if maxSequence+1 < maxInputLevel {
      maxSequence = maxInputLevel - 1
  }

  This branch:
  for _, minSeq := range minSeqByGen {
      level := minSeq
      if level > 4 {
          level = 4
      }
      if maxSequence+1 < level {
          maxSequence = level - 1
      }
  }

  My loop bumps maxSequence incrementally per generation. Since the bump only increases maxSequence, after the full loop, maxSequence ends at max(original, maxLevel-1) where maxLevel
  is the highest level seen. That's the same as computing maxInputLevel first and then bumping once.

  They produce identical results for all inputs. The gw branch is cleaner — it separates the computation from the mutation and avoids redundant comparisons inside the loop.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adopting your changes.

Explicitly calcluate maxInputLevel for clarity.

(cherry picked from commit 142fcaa)
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Copy Markdown
Member

@gwossum gwossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@davidby-influx davidby-influx merged commit 381fd2d into master-1.x Feb 19, 2026
12 of 13 checks passed
@davidby-influx davidby-influx deleted the DSB/max_seq_fix branch February 19, 2026 21:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

compact() regresses L4 files to L2 when compacting mixed-level groups

3 participants