Bug #74713
rgw/notifications: Persistent notification queue full even when queue is empty
Description
Per the persistent-notification flow, we first reserve and then commit.
During the reservation step, the code tracks active/pending reservations in urgent_data.reserved_size, which is incremented on every call to publish_reserve().
That same counter is decremented when the message is committed or aborted. However, the value being decremented does not match what was incremented:
On reserve, the counter is increased by the payload size plus an overhead:
const auto overhead = res_op.entries * QUEUE_ENTRY_OVERHEAD;
On commit/abort, the decrement does not include that overhead.
Over time, this mismatch causes urgent_data.reserved_size to continually grow, eventually triggering “queue full” even when there is still available capacity in the queue.
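The drift described above can be reproduced with a small standalone model. This is only a sketch, not the cls_2pc_queue code: ReservationCounter, leftover_after, and the 512-byte payload are hypothetical, and the overhead is assumed to be 10 bytes per entry to match the per-reservation growth reported in this ticket.

```cpp
#include <cassert>
#include <cstdint>

// Assumed value for illustration only; chosen to match the 10-byte
// per-reservation growth reported in this ticket.
constexpr std::uint64_t kQueueEntryOverhead = 10;

// Hypothetical stand-in for the counter kept in urgent_data.reserved_size.
struct ReservationCounter {
  std::uint64_t reserved_size = 0;

  // Reserve: payload plus per-entry overhead, as in the description.
  void reserve(std::uint64_t payload, std::uint64_t entries) {
    reserved_size += payload + entries * kQueueEntryOverhead;
  }

  // Commit/abort as described in the bug: only the payload is subtracted.
  void release_payload_only(std::uint64_t payload) {
    assert(reserved_size >= payload);
    reserved_size -= payload;
  }

  // Symmetric commit/abort: subtracts exactly what reserve() added.
  void release_with_overhead(std::uint64_t payload, std::uint64_t entries) {
    const std::uint64_t total = payload + entries * kQueueEntryOverhead;
    assert(reserved_size >= total);
    reserved_size -= total;
  }
};

// Runs `cycles` reserve/release pairs on an otherwise empty queue and
// returns the bytes still counted as reserved afterwards.
std::uint64_t leftover_after(std::uint64_t cycles, bool include_overhead) {
  ReservationCounter counter;
  const std::uint64_t payload = 512;  // arbitrary payload size
  for (std::uint64_t i = 0; i < cycles; ++i) {
    counter.reserve(payload, 1);
    if (include_overhead) {
      counter.release_with_overhead(payload, 1);
    } else {
      counter.release_payload_only(payload);
    }
  }
  return counter.reserved_size;
}
```

Under these assumptions, leftover_after(1000, false) returns 10000 even though every reservation was released, while leftover_after(1000, true) returns 0.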
Updated by Krunal Chheda about 2 months ago · Edited
Looking at the logs spread across one day:
2026-02-02T18:25:30.503+0000 7f61577f6700 20 <cls> /root/rpmbuild/ceph-19.2.1.squid_release/src/cls/2pc_queue/cls_2pc_queue.cc:209: INFO: cls_2pc_queue_reserve: current reservations: 5065376 (bytes)
2026-02-03T19:51:02.835+0000 7f61577f6700 20 <cls> /root/rpmbuild/ceph-19.2.1.squid_release/src/cls/2pc_queue/cls_2pc_queue.cc:209: INFO: cls_2pc_queue_reserve: current reservations: 5089826 (bytes)
We see that urgent_data.reserved_size keeps increasing.
Ideally, that value should increase as we reserve and decrease as we commit/abort.
Instead it grows continuously, which highlights the problem.
The value increases by 10 bytes for every reservation, so once 12.8M operations have been performed on a bucket, urgent_data.reserved_size will reach 128M and further writes to the bucket will be blocked until the topic is deleted and re-created.
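The arithmetic behind these numbers can be checked with a short sketch. The function names are hypothetical, and 128M is read here as 128,000,000 bytes to match the 12.8M figure. If the limit is actually 128 MiB (134,217,728 bytes), the count would be slightly higher.

```cpp
#include <cassert>
#include <cstdint>

// Leak per reservation in bytes, as reported in this comment.
constexpr std::uint64_t kLeakPerReservation = 10;

// Number of reservations implied by the growth between two
// "current reservations" log samples, assuming the queue was otherwise idle.
std::uint64_t reservations_between(std::uint64_t first_sample,
                                   std::uint64_t second_sample) {
  assert(second_sample >= first_sample);
  return (second_sample - first_sample) / kLeakPerReservation;
}

// Number of reservations needed for the leaked bytes alone to reach
// the given reservation limit.
std::uint64_t reservations_until_full(std::uint64_t limit_bytes) {
  return limit_bytes / kLeakPerReservation;
}
```

With the two log samples above, reservations_between(5065376, 5089826) gives 2445, meaning the 24450-byte growth corresponds to roughly 2445 reservations over the day. reservations_until_full(128000000) gives 12800000.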
Updated by Upkeep Bot 26 days ago
- Status changed from New to Pending Backport
- Merge Commit set to 5e89aff28c6570f888de14fe56a8c05bbbf3d757
- Fixed In set to v20.3.0-5667-g5e89aff28c
- Upkeep Timestamp set to 2026-02-26T14:52:33+00:00
Updated by Upkeep Bot 26 days ago
- Copied to Backport #75191: squid: rgw/notifications: Persistent notification queue full even when queue is empty added
Updated by Upkeep Bot 26 days ago
- Copied to Backport #75192: tentacle: rgw/notifications: Persistent notification queue full even when queue is empty added