[Bug]: Restore can fail if WAL size is not the default and maxParallel > 1 #8874

@leonardoce

Description

Is there an existing issue already for this bug?

  • I have searched for an existing issue, and could not find anything. I believe this is a new bug.

I have read the troubleshooting guide

  • I have read the troubleshooting guide and I think this is a new bug.

I am running a supported version of CloudNativePG

  • I am running a currently supported version of CloudNativePG.

Contact Details

leonardo.cecchi@enterprisedb.com

Version

1.27 (latest patch)

What version of Kubernetes are you using?

1.34

What is your Kubernetes environment?

Self-managed: kind (evaluation)

How did you install the operator?

YAML manifest

What happened?

When restoring WAL files in parallel, the restore code assumes WAL segments are the default 16 MB in size and uses that assumption to compute the names of the subsequent WAL files to download in parallel.

If the WALs are not the default size, some of these computed names may not correspond to existing files. If the restore tries to fetch one of these non-existent WALs, it fails with "end of WAL reached" and "Set end-of-wal-stream flag as one of the WAL files to be prefetched was not found". This stops WAL restore.

If the backup spans multiple segments, the restored cluster will be inconsistent, since the next segment is required to complete recovery.
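To illustrate why the computed names can point at files that do not exist: PostgreSQL WAL file names are `<timeline><log><segment>` (three 8-digit hex fields), and the number of segments per "log" unit is `0x100000000 / wal_segment_size`. A prefetcher that hard-codes 16 MiB therefore wraps the segment counter at the wrong boundary for other sizes. The following is a minimal sketch; `nextWALName` is a hypothetical helper for illustration, not the operator's actual code:

```go
package main

import "fmt"

// nextWALName computes the name of the WAL segment that follows the
// given (timeline, log, seg) position, for a cluster whose
// wal_segment_size is segmentSize bytes. PostgreSQL rolls the segment
// counter over to the next "log" after 0x100000000/segmentSize segments.
func nextWALName(timeline, log, seg uint32, segmentSize uint64) string {
	segsPerLog := uint32(0x100000000 / segmentSize)
	seg++
	if seg >= segsPerLog {
		seg = 0
		log++
	}
	return fmt.Sprintf("%08X%08X%08X", timeline, log, seg)
}

func main() {
	// With the default 16 MiB segments, segment 0x3F is followed by 0x40:
	fmt.Println(nextWALName(1, 5, 0x3F, 16*1024*1024)) // 000000010000000500000040
	// With 64 MiB segments, 0x3F is the LAST segment of log 5, so the
	// next real WAL is 000000010000000600000000. A prefetcher assuming
	// 16 MiB asks the archive for 000000010000000500000040 instead,
	// which does not exist, and restore wrongly concludes end-of-WAL.
	fmt.Println(nextWALName(1, 5, 0x3F, 64*1024*1024)) // 000000010000000600000000
}
```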

As a workaround, setting `maxParallel` to 1 avoids the problem.
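For reference, the workaround lives in the `wal` section of the object-store configuration. A minimal sketch using the in-tree `barmanObjectStore` stanza of the `Cluster` resource (the bucket path is a placeholder; the plugin's `ObjectStore` resource exposes an equivalent `wal.maxParallel` field):

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: cluster-example
spec:
  instances: 3
  backup:
    barmanObjectStore:
      destinationPath: s3://backups/   # placeholder destination
      wal:
        maxParallel: 1   # workaround: restore WALs one at a time
```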

See cloudnative-pg/plugin-barman-cloud#603

Cluster resource

Relevant log output

Code of Conduct

  • I agree to follow this project's Code of Conduct

Metadata

Labels

bug 🐛 Something isn't working