Conversation
Contributor
|
❗ By default, the pull request is configured to backport to all release branches.
|
Member
Author
|
/test |
Contributor
|
@mnencia, here's the link to the E2E on CNPG workflow run: https://github.com/cloudnative-pg/cloudnative-pg/actions/runs/20824279808 |
3ffa226 to
34bf9be
Compare
b60ebdc to
6e3957e
Compare
Member
Author
|
/test |
Contributor
|
@mnencia, here's the link to the E2E on CNPG workflow run: https://github.com/cloudnative-pg/cloudnative-pg/actions/runs/20854827166 |
Member
|
/test |
Contributor
|
@armru, here's the link to the E2E on CNPG workflow run: https://github.com/cloudnative-pg/cloudnative-pg/actions/runs/20917830755 |
armru
approved these changes
Jan 12, 2026
mnencia
commented
Jan 12, 2026
Member
Author
|
/test |
Contributor
|
@mnencia, here's the link to the E2E on CNPG workflow run: https://github.com/cloudnative-pg/cloudnative-pg/actions/runs/20930738286 |
jbattiato
approved these changes
Jan 19, 2026
…story files

Replicas can crash-loop when orphaned "future timeline" .history files exist in the WAL archive. This can occur during split-brain scenarios or other conditions where timeline history files are created for timelines that the cluster never officially adopts.

This fix adds validation during WAL restore to prevent replicas from downloading timeline history files with timeline IDs greater than the cluster's current timeline. Primary instances retain full access.

The validation works by:
- Parsing timeline ID from .history filenames (e.g., 00000022.history)
- Checking if the instance is a primary or replica
- For replicas, rejecting files where fileTimeline > clusterTimeline
- Returning "file not found" to PostgreSQL for rejected files

This prevents PostgreSQL from ever seeing the problematic history file, allowing normal recovery to proceed. Combined with PR #9637 (auto-recovery via re-cloning), this provides complete coverage of timeline divergence scenarios.

Closes #4188

Signed-off-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
Add comprehensive unit tests for the timeline history file validation logic. Tests cover regular WAL files, invalid filenames, primary behavior, and replica behavior with current, past, and future timelines.

Signed-off-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>

Add E2E test demonstrating protection against future timeline history files through a backup and restore scenario. The test creates two clusters sharing a WAL archive, where cluster 2 creates timeline 2 history files. When cluster 1 scales up, the new replica successfully joins without crash-looping, validating the protection works correctly.

Signed-off-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
Signed-off-by: Armando Ruocco <armando.ruocco@enterprisedb.com>
Signed-off-by: Armando Ruocco <armando.ruocco@enterprisedb.com>
Move the timeline divergence test into its own Context block with its own BeforeAll hook that creates a dedicated namespace and MinIO setup. This allows the test to run independently without requiring the main backup test cluster to remain running, reducing peak resource usage from 5 to 3 PostgreSQL instances during timeline test execution.

Signed-off-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
cnpg-bot
pushed a commit
that referenced
this pull request
Jan 20, 2026
…story files (#9650)

Replicas can crash-loop when orphaned "future timeline" .history files exist in the WAL archive. This can occur during split-brain scenarios or other conditions where timeline history files are created for timelines that the cluster never officially adopts.

This fix adds validation during WAL restore to prevent replicas from downloading timeline history files with timeline IDs greater than the cluster's current timeline. Primary instances retain full access.

The validation works by:
- Parsing timeline ID from .history filenames (e.g., 00000022.history)
- Checking if the instance is a primary or replica
- For replicas, rejecting files where fileTimeline > clusterTimeline
- Returning "file not found" to PostgreSQL for rejected files

This prevents PostgreSQL from ever seeing the problematic history file, allowing normal recovery to proceed. Combined with PR #9637 (auto-recovery via re-cloning), this provides complete coverage of timeline divergence scenarios.

Closes #4188

Signed-off-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
Signed-off-by: Armando Ruocco <armando.ruocco@enterprisedb.com>
Co-authored-by: Armando Ruocco <armando.ruocco@enterprisedb.com>
(cherry picked from commit ef73994)
mnencia
added a commit
that referenced
this pull request
Feb 3, 2026
Move timeline history file validation to execute before any WAL restore attempt (plugin or in-tree) rather than only for in-tree restores. This prevents replicas from downloading timeline history files with timeline IDs higher than the cluster's current timeline when plugins handle WAL restore.

The timeline protection added in #9650 was only applied to in-tree WAL restore, but not to plugin-based restore. This allowed the protection to be completely bypassed when using plugins, causing replicas to download future timeline history files and fail with timeline mismatch errors.

Fixes the timeline protection introduced in ef73994.

Signed-off-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
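For readers following the change, the reordering described in this commit can be sketched roughly as below. This is only an illustration under stated assumptions: `restoreWAL`, `restorer`, and `errFutureTimeline` are hypothetical names, not the operator's actual API, and the rule used here to choose between the plugin and the in-tree path (prefer the plugin when one is configured) is a simplification, since the commit message does not describe it.

```go
package main

import (
	"errors"
	"fmt"
)

// errFutureTimeline is a hypothetical sentinel for a rejected history file.
var errFutureTimeline = errors.New("future timeline history file rejected")

// restorer is a hypothetical abstraction over a WAL restore backend
// (plugin-based or in-tree).
type restorer func(walName string) error

// restoreWAL runs the timeline validation first, so that neither backend
// is reached when the requested file is rejected. The backend choice
// (plugin if configured, otherwise in-tree) is an assumption of this sketch.
func restoreWAL(walName string, validate func(string) error, plugin, inTree restorer) error {
	if err := validate(walName); err != nil {
		return err
	}
	if plugin != nil {
		return plugin(walName)
	}
	return inTree(walName)
}

func main() {
	reject := func(name string) error {
		return fmt.Errorf("%w: %s", errFutureTimeline, name)
	}
	plugin := func(name string) error {
		fmt.Println("plugin restoring", name)
		return nil
	}
	inTree := func(name string) error {
		fmt.Println("in-tree restoring", name)
		return nil
	}

	// The validation rejects the file, so neither backend is invoked.
	err := restoreWAL("00000003.history", reject, plugin, inTree)
	fmt.Println(errors.Is(err, errFutureTimeline))
}
```

Placing the check in front of both backends means a newly added restore path cannot bypass it by accident, which is the gap this commit describes for plugin-based restore.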
leonardoce
added a commit
that referenced
this pull request
Feb 4, 2026
…9849)

Move timeline history file validation to execute before any WAL restore attempt (plugin or in-tree) rather than only for in-tree restores. This prevents replicas from downloading timeline history files with timeline IDs higher than the cluster's current timeline when plugins handle WAL restore.

The timeline protection added in #9650 was only applied to in-tree WAL restore, but not to plugin-based restore. This allowed the protection to be completely bypassed when using plugins, causing replicas to download future timeline history files and fail with timeline mismatch errors.

Fixes the timeline protection introduced in ef73994.

Signed-off-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
Signed-off-by: Leonardo Cecchi <leonardo.cecchi@enterprisedb.com>
Co-authored-by: Leonardo Cecchi <leonardo.cecchi@enterprisedb.com>
leonardoce
added a commit
that referenced
this pull request
Feb 4, 2026
…9849)

Move timeline history file validation to execute before any WAL restore attempt (plugin or in-tree) rather than only for in-tree restores. This prevents replicas from downloading timeline history files with timeline IDs higher than the cluster's current timeline when plugins handle WAL restore.

The timeline protection added in #9650 was only applied to in-tree WAL restore, but not to plugin-based restore. This allowed the protection to be completely bypassed when using plugins, causing replicas to download future timeline history files and fail with timeline mismatch errors.

Fixes the timeline protection introduced in ef73994.

Signed-off-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
Signed-off-by: Leonardo Cecchi <leonardo.cecchi@enterprisedb.com>
Co-authored-by: Leonardo Cecchi <leonardo.cecchi@enterprisedb.com>
(cherry picked from commit 1044c8e)
(cherry picked from commit d0e801ee5392465d8629056c79e58c32b215dcff)
leonardoce
added a commit
that referenced
this pull request
Feb 4, 2026
…9849)

Move timeline history file validation to execute before any WAL restore attempt (plugin or in-tree) rather than only for in-tree restores. This prevents replicas from downloading timeline history files with timeline IDs higher than the cluster's current timeline when plugins handle WAL restore.

The timeline protection added in #9650 was only applied to in-tree WAL restore, but not to plugin-based restore. This allowed the protection to be completely bypassed when using plugins, causing replicas to download future timeline history files and fail with timeline mismatch errors.

Fixes the timeline protection introduced in ef73994.

Signed-off-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
Signed-off-by: Leonardo Cecchi <leonardo.cecchi@enterprisedb.com>
Co-authored-by: Leonardo Cecchi <leonardo.cecchi@enterprisedb.com>
(cherry picked from commit 1044c8e)
(cherry picked from commit d0e801ee5392465d8629056c79e58c32b215dcff)
(cherry picked from commit 9e2e469)
mnencia
added a commit
that referenced
this pull request
Feb 4, 2026
…9849)

Move timeline history file validation to execute before any WAL restore attempt (plugin or in-tree) rather than only for in-tree restores. This prevents replicas from downloading timeline history files with timeline IDs higher than the cluster's current timeline when plugins handle WAL restore.

The timeline protection added in #9650 was only applied to in-tree WAL restore, but not to plugin-based restore. This allowed the protection to be completely bypassed when using plugins, causing replicas to download future timeline history files and fail with timeline mismatch errors.

Fixes the timeline protection introduced in ef73994.

Signed-off-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
Signed-off-by: Leonardo Cecchi <leonardo.cecchi@enterprisedb.com>
Co-authored-by: Leonardo Cecchi <leonardo.cecchi@enterprisedb.com>
(cherry picked from commit 1044c8e)
Replicas can crash-loop when orphaned "future timeline" .history files
exist in the WAL archive. This can occur during split-brain scenarios
or other conditions where timeline history files are created for timelines
that the cluster never officially adopts.
This fix adds validation during WAL restore to prevent replicas from
downloading timeline history files with timeline IDs greater than the
cluster's current timeline. Primary instances retain full access.
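The validation works by:
- Parsing timeline ID from .history filenames (e.g., 00000022.history)
- Checking if the instance is a primary or replica
- For replicas, rejecting files where fileTimeline > clusterTimeline
- Returning "file not found" to PostgreSQL for rejected files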
This prevents PostgreSQL from ever seeing the problematic history file,
allowing normal recovery to proceed. Combined with PR #9637 (auto-recovery
via re-cloning), this provides complete coverage of timeline divergence
scenarios.
Closes #4188
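As an illustration of the validation steps described in this PR, here is a minimal Go sketch. It is not the code from the change: `parseHistoryTimeline`, `validateTimelineHistoryFile`, and `ErrWALNotFound` are hypothetical names chosen for this example, and the sketch assumes the timeline ID is the 8-digit hexadecimal prefix of the file name, as PostgreSQL names its timeline history files.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"strconv"
)

// ErrWALNotFound is a hypothetical sentinel standing in for the
// "file not found" result reported back to PostgreSQL.
var ErrWALNotFound = errors.New("WAL file not found")

// historyFileRe matches timeline history file names such as
// "00000022.history" (8 hexadecimal digits followed by ".history").
var historyFileRe = regexp.MustCompile(`^([0-9A-Fa-f]{8})\.history$`)

// parseHistoryTimeline extracts the timeline ID from a history file name.
// It returns false for regular WAL files and any other name.
func parseHistoryTimeline(name string) (uint32, bool) {
	m := historyFileRe.FindStringSubmatch(name)
	if m == nil {
		return 0, false
	}
	tli, err := strconv.ParseUint(m[1], 16, 32)
	if err != nil {
		return 0, false
	}
	return uint32(tli), true
}

// validateTimelineHistoryFile rejects history files whose timeline is
// greater than the cluster's current timeline, but only on replicas.
// Primaries and non-history files always pass.
func validateTimelineHistoryFile(walName string, isPrimary bool, clusterTimeline uint32) error {
	fileTimeline, ok := parseHistoryTimeline(walName)
	if !ok || isPrimary {
		return nil
	}
	if fileTimeline > clusterTimeline {
		return fmt.Errorf("%w: %s is on timeline %d, cluster is on timeline %d",
			ErrWALNotFound, walName, fileTimeline, clusterTimeline)
	}
	return nil
}

func main() {
	// Replica on timeline 0x21 asking for timeline 0x22: rejected.
	fmt.Println(validateTimelineHistoryFile("00000022.history", false, 0x21))
	// Same request from a primary: allowed.
	fmt.Println(validateTimelineHistoryFile("00000022.history", true, 0x21))
	// Regular WAL segment: not a history file, so always allowed.
	fmt.Println(validateTimelineHistoryFile("000000220000000000000001", false, 0x21))
}
```

Wrapping the sentinel with `%w` lets a caller detect the rejection with `errors.Is` and map it to the "file not found" response, while the message still records which timelines were compared.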