Bug #69604

closed

Enabling async discards against devices with slow discard performance can lead to out-of-space conditions

Added by Joshua Baergen about 1 year ago. Updated 5 months ago.

Status: Resolved
Priority: Normal
Target version: -
% Done: 100%
Source: Community (dev)
Backport: squid, reef
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Tags (freeform): backport_processed
Fixed In: v19.3.0-7815-geef1641283
Released In: v20.2.0~1040
Upkeep Timestamp: 2025-11-01T01:32:16+00:00

Description

The async discard queue has no length bound. If OSD operations continually generate discards faster than the underlying device can process them, the OSD can run out of blocks available at the allocator, because the freed blocks are all sitting in the async discard queue.

Restarting the OSD will recover this space as the discard queue gets dropped on restart and the allocation map gets rebuilt on boot.

I have a patch that I will post shortly which addresses this by placing a cap on the number of items that can be queued for async discard at any given time.
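
To illustrate the failure mode and the effect of a cap, here is a minimal toy model in Python. It is not the actual patch: the class `ToyOSD`, the `max_pending` parameter, the block counts, and the policy of returning blocks to the allocator without discarding them when the queue is full are all assumptions made for this sketch.

```python
from collections import deque


class ToyOSD:
    """Toy model: blocks are either free, in use, or parked in the discard queue."""

    def __init__(self, total_blocks, max_pending=None):
        self.free = total_blocks        # blocks the allocator can hand out
        self.pending = deque()          # freed extents waiting for a discard
        self.max_pending = max_pending  # None models the unbounded queue

    def allocate(self, n):
        if n > self.free:
            return False                # out of space as seen by the allocator
        self.free -= n
        return True

    def release(self, n):
        """Free an extent: queue it for discard unless the queue is at its cap."""
        if self.max_pending is not None and len(self.pending) >= self.max_pending:
            self.free += n              # assumed policy: skip the discard
        else:
            self.pending.append(n)

    def discard_one(self):
        """The device completes one queued discard; its blocks become allocatable."""
        if self.pending:
            self.free += self.pending.popleft()


def run(max_pending, rounds=100):
    """Return the round at which allocation fails, or None if it never does."""
    osd = ToyOSD(total_blocks=50, max_pending=max_pending)
    for i in range(rounds):
        # Overwrite-style workload: allocate one block, release one block.
        if not osd.allocate(1):
            return i
        osd.release(1)
        if i % 4 == 3:                  # slow device: one discard per four releases
            osd.discard_one()
    return None
```

In this model, `run(max_pending=None)` hits an allocation failure partway through the workload even though no blocks remain in use, while `run(max_pending=8)` completes all rounds because at most eight extents can be parked in the queue at once.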


Related issues 2 (0 open, 2 closed)

Copied to bluestore - Backport #70208: reef: Enabling async discards against devices with slow discard performance can lead to out-of-space conditions (Resolved, Igor Fedotov)
Copied to bluestore - Backport #70209: squid: Enabling async discards against devices with slow discard performance can lead to out-of-space conditions (Resolved, Igor Fedotov)
Actions #1

Updated by Joshua Baergen about 1 year ago

  • Status changed from New to Fix Under Review
  • Pull request ID set to 61455
Actions #2

Updated by Adam Kupczyk about 1 year ago

  • Status changed from Fix Under Review to Pending Backport
  • Backport set to squid, reef
Actions #3

Updated by Upkeep Bot about 1 year ago

  • Copied to Backport #70208: reef: Enabling async discards against devices with slow discard performance can lead to out-of-space conditions added
Actions #4

Updated by Upkeep Bot about 1 year ago

  • Copied to Backport #70209: squid: Enabling async discards against devices with slow discard performance can lead to out-of-space conditions added
Actions #5

Updated by Upkeep Bot about 1 year ago

  • Tags (freeform) set to backport_processed
Actions #6

Updated by Upkeep Bot 9 months ago

  • Merge Commit set to eef16412834f79023726d78401ba7d60242b1a51
  • Fixed In set to v19.3.0-7815-geef16412834
  • Upkeep Timestamp set to 2025-07-08T18:12:19+00:00
Actions #7

Updated by Upkeep Bot 8 months ago

  • Fixed In changed from v19.3.0-7815-geef16412834 to v19.3.0-7815-geef16412834f
  • Upkeep Timestamp changed from 2025-07-08T18:12:19+00:00 to 2025-07-14T15:22:31+00:00
Actions #8

Updated by Upkeep Bot 8 months ago

  • Fixed In changed from v19.3.0-7815-geef16412834f to v19.3.0-7815-geef1641283
  • Upkeep Timestamp changed from 2025-07-14T15:22:31+00:00 to 2025-07-14T21:08:31+00:00
Actions #9

Updated by Konstantin Shalygin 7 months ago

  • Status changed from Pending Backport to Resolved
  • Assignee set to Joshua Baergen
  • % Done changed from 0 to 100
  • Source set to Community (dev)
Actions #10

Updated by Upkeep Bot 5 months ago

  • Released In set to v20.2.0~1040
  • Upkeep Timestamp changed from 2025-07-14T21:08:31+00:00 to 2025-11-01T01:32:16+00:00