Do not clear pending if points to the checkpoint #602

Merged
michaelklishin merged 1 commit into rabbitmq:main from deadtrickster:patch-1
Mar 26, 2026

@deadtrickster
Contributor

With Ra v3, Tanzu DQ's log writer errors out as soon as a snapshot is requested. For example:

2026-03-25 22:50:44.857425+01:00 [warning] <0.1364.0> segment_writer: missing index 313868 in mem table for uid DQ_2F_57D0UL0CNGN2start index 313868 checking to see if UId has been unregistered
2026-03-25 22:50:44.857523+01:00 [error] <0.1364.0> segment_writer: uid <redacted> is registered, exiting...
2026-03-25 22:50:44.857629+01:00 [error] <0.1341.0> segment_writer: 1 failures encountered during segment flush. Errors: [{{<<"redacted">>,
2026-03-25 22:50:44.857629+01:00 [error] <0.1341.0>                                                                         [{#Ref<0.588533474.2887385091.231957>,
2026-03-25 22:50:44.857629+01:00 [error] <0.1341.0>                                                                           [{313868,
2026-03-25 22:50:44.857629+01:00 [error] <0.1341.0>                                                                             699049}]}]},
2026-03-25 22:50:44.857629+01:00 [error] <0.1341.0>                                                                        {missing_index,
2026-03-25 22:50:44.857629+01:00 [error] <0.1341.0>                                                                         <<"redacted">>,
2026-03-25 22:50:44.857629+01:00 [error] <0.1341.0>                                                                         313868}}]

Initially I thought this was due to `[]` being the default for live indexes, but it looks like it is caused by a race in `complete_snapshot`.

When a new checkpoint write is in progress and `release_cursor` fires `promote_checkpoint`, the `pending` field gets overwritten, and the subsequent checkpoint completion clobbers the snapshot's pending state.
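
To make the interleaving concrete, here is a toy model of the race in Python. It is only a sketch of the sequence described above: the `SnapshotState` class, the function names, and the index values are hypothetical, and the guarded variant is an assumed way to avoid the clobbering, not the actual patch to Ra.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Toy model only: names mirror the prose above, not Ra's real record or API.
Pending = Tuple[str, int]  # (kind, index); kind is "checkpoint" or "snapshot"


@dataclass
class SnapshotState:
    pending: Optional[Pending] = None


def begin_checkpoint(state: SnapshotState, idx: int) -> None:
    # A checkpoint write starts and is recorded as pending.
    state.pending = ("checkpoint", idx)


def promote_checkpoint(state: SnapshotState, idx: int) -> None:
    # Promotion overwrites whatever was pending before.
    state.pending = ("snapshot", idx)


def complete_checkpoint_naive(state: SnapshotState, idx: int) -> None:
    # Clears pending unconditionally, even if it now tracks a snapshot.
    state.pending = None


def complete_checkpoint_guarded(state: SnapshotState, idx: int) -> None:
    # Assumed guard: clear pending only if it still refers to this checkpoint.
    if state.pending == ("checkpoint", idx):
        state.pending = None


def run(complete) -> Optional[Pending]:
    state = SnapshotState()
    begin_checkpoint(state, 200)    # checkpoint write in progress
    promote_checkpoint(state, 100)  # release_cursor promotes an older checkpoint
    complete(state, 200)            # the in-flight checkpoint write finishes
    return state.pending
```

In this model, `run(complete_checkpoint_naive)` ends with no pending entry, so the promoted snapshot is forgotten, while `run(complete_checkpoint_guarded)` still reports the snapshot as pending.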

@michaelklishin added this to the 3.1.1 milestone on Mar 25, 2026
@michaelklishin merged commit 36233c1 into rabbitmq:main on Mar 26, 2026
7 checks passed
jimsynz pushed a commit to jimsynz/neonfs that referenced this pull request Mar 29, 2026
This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [ra](https://hex.pm/packages/ra) ([source](https://github.com/rabbitmq/ra)) | prod | patch | `3.1.0` → `3.1.2` |

---

### Release Notes

<details>
<summary>rabbitmq/ra (ra)</summary>

### [`v3.1.2`](https://github.com/rabbitmq/ra/releases/tag/v3.1.2)

[Compare Source](rabbitmq/ra@v3.1.1...v3.1.2)

#### What's Changed

- `gen_batch_server` was bumped to `0.9.2` to significantly reduce the risk of an [OOM scenario](rabbitmq/gen-batch-server#27) that affects Ra-based systems

**Full Changelog**: <rabbitmq/ra@v3.1.1...v3.1.2>

### [`v3.1.1`](https://github.com/rabbitmq/ra/releases/tag/v3.1.1)

[Compare Source](rabbitmq/ra@v3.1.0...v3.1.1)

#### What's Changed

- Use the Unicode translation modifier to log server IDs and cluster names by [@the-mikedavis](https://github.com/the-mikedavis) in [#599](rabbitmq/ra#599)
- Export `ra:membership()` type by [@the-mikedavis](https://github.com/the-mikedavis) in [#603](rabbitmq/ra#603)
- Do not clear pending if points to the checkpoint by [@deadtrickster](https://github.com/deadtrickster) in [#602](rabbitmq/ra#602)
- Fix doubly-wrapped log entries after sparse write and recovery by [@ansd](https://github.com/ansd) in [#601](rabbitmq/ra#601)

**Full Changelog**: <rabbitmq/ra@v3.1.0...v3.1.1>

</details>

---

### Configuration

📅 **Schedule**: Branch creation - Between 12:00 AM and 03:59 AM ( * 0-3 * * * ) in timezone Pacific/Auckland, Automerge - Between 12:00 AM and 03:59 AM ( * 0-3 * * * ) in timezone Pacific/Auckland.

🚦 **Automerge**: Enabled.

♻ **Rebasing**: Whenever PR is behind base branch, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box

---

This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).

Reviewed-on: https://harton.dev/project-neon/neonfs/pulls/84
Co-authored-by: Renovate Bot <bot@harton.nz>
Co-committed-by: Renovate Bot <bot@harton.nz>