Bug #65130

closed

crimson: crimson-rados did not detect reintroduction of https://tracker.ceph.com/issues/61875

Added by Samuel Just about 2 years ago. Updated 5 months ago.

Status:
Resolved
Priority:
Urgent
Category:
-
Target version:
-
% Done:

0%

Source:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Tags (freeform):
Fixed In:
v19.3.0-1574-g469fae1f84
Released In:
v20.2.0~3103
Upkeep Timestamp:
2025-11-01T01:19:19+00:00

Description

https://github.com/ceph/ceph/pull/56376 would have reintroduced https://tracker.ceph.com/issues/61875, because it puts the snap mapper keys back into the pg meta object. Oddly, a teuthology run on that branch, which appears to have included tests with both snapshots and OSD restarts, showed no crashes associated with this regression, and at least one job that should have exercised the relevant code passed. A quick look at FuturizedShardStoreLogReader in PGLog.cc shows no changes, so it should have crashed in the final else branch of FuturizedShardStoreLogReader::process_entry at e.decode_with_checksum.
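
To illustrate the expected failure mode, here is a minimal Python sketch of a log reader that treats every unrecognized omap key as a checksummed log entry. It is a toy model, not the Ceph implementation: the key names, the value format (payload plus trailing CRC32), and the helpers `encode_with_checksum`, `decode_with_checksum`, and `read_log` are all assumptions made for the example.

```python
import struct
import zlib


class ChecksumError(ValueError):
    """Raised when a value's trailing CRC does not match its payload."""


def encode_with_checksum(payload: bytes) -> bytes:
    # Toy format (assumption): payload followed by a little-endian CRC32.
    return payload + struct.pack("<I", zlib.crc32(payload))


def decode_with_checksum(value: bytes) -> bytes:
    if len(value) < 4:
        raise ChecksumError("value too short to carry a checksum")
    payload = value[:-4]
    (crc,) = struct.unpack("<I", value[-4:])
    if zlib.crc32(payload) != crc:
        raise ChecksumError("checksum mismatch")
    return payload


# Hypothetical metadata keys that the reader recognizes and skips.
KNOWN_KEYS = {"can_rollback_to", "rollback_info_trimmed_to"}


def read_log(omap: dict[str, bytes]) -> list[bytes]:
    """Return decoded log entries from a pg-meta-like key/value map."""
    entries = []
    for key, value in sorted(omap.items()):
        if key.startswith("_") or key in KNOWN_KEYS:
            continue  # recognized non-log metadata
        elif key.startswith("missing") or key.startswith("dup_"):
            continue  # other recognized key families, ignored in this sketch
        else:
            # Final else branch: anything unrecognized is assumed to be a
            # log entry, so a foreign key ends up here and fails to decode.
            entries.append(decode_with_checksum(value))
    return entries
```

In this model, a map holding only log entries and recognized metadata reads cleanly, while adding a single foreign key (for example a hypothetical snap-mapper-style key `SNA_1_obj` whose value carries no checksum) makes `read_log` raise `ChecksumError`. That is the kind of failure the suite would be expected to surface after an OSD restart.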

Tasks:
- Confirm that the crimson-rados suite actually combines snapshots with OSD restarts
- Work out why the existing suite didn't fail the above PR
- Amend the tests to cover the gap


Related issues 1 (0 open, 1 closed)

Related to crimson - Bug #65247: ObjectContext::drop_recovery_read(): Assertion `recovery_read_marker' failed. (Closed; Matan Breizman)

Actions #1

Updated by Samuel Just about 2 years ago

Note that the defect addressed by the above PR is going to be handled another way -- this bug is about why the test suite didn't catch this specific regression.

Actions #2

Updated by Samuel Just about 2 years ago

  • Description updated (diff)
Actions #3

Updated by Matan Breizman about 2 years ago

  • Status changed from New to Fix Under Review
  • Assignee set to Matan Breizman
  • Pull request ID set to 56511
Actions #4

Updated by Matan Breizman almost 2 years ago

  • Related to Bug #65247: ObjectContext::drop_recovery_read(): Assertion `recovery_read_marker' failed. added
Actions #5

Updated by Matan Breizman almost 2 years ago

Added label: crimson-replicated-recovery to track all the required fixes

https://github.com/ceph/ceph/pulls?q=+is%3Apr+label%3Acrimson-replicated-recovery

Actions #6

Updated by Matan Breizman almost 2 years ago

  • Status changed from Fix Under Review to In Progress
Since it will probably take a few iterations before the thrash and recovery tests ('default.yaml')
pass successfully, add another, 'simple.yaml', which should remain stable.

This tracker can be closed once default.yaml passes successfully.

Actions #7

Updated by Matan Breizman over 1 year ago

  • Status changed from In Progress to Resolved
Actions #8

Updated by Upkeep Bot 9 months ago

  • Merge Commit set to 469fae1f84139ad52858cf51e82fd8b0b20e1d91
  • Fixed In set to v19.3.0-1574-g469fae1f841
  • Upkeep Timestamp set to 2025-07-11T13:51:22+00:00
Actions #9

Updated by Upkeep Bot 9 months ago

  • Fixed In changed from v19.3.0-1574-g469fae1f841 to v19.3.0-1574-g469fae1f84
  • Upkeep Timestamp changed from 2025-07-11T13:51:22+00:00 to 2025-07-14T23:09:59+00:00
Actions #10

Updated by Upkeep Bot 5 months ago

  • Released In set to v20.2.0~3103
  • Upkeep Timestamp changed from 2025-07-14T23:09:59+00:00 to 2025-11-01T01:19:19+00:00