Bug #47838
closedmon/test_mon_osdmap_prune.sh: first_pinned != trim_to
100%
Description
2020-10-12T11:01:32.103 INFO:tasks.workunit.client.0.smithi057.stderr:++ jq .osdmap_first_committed 2020-10-12T11:01:32.104 INFO:tasks.workunit.client.0.smithi057.stderr:+ fc=11 2020-10-12T11:01:32.104 INFO:tasks.workunit.client.0.smithi057.stderr:+ [[ 11 -eq 11 ]] 2020-10-12T11:01:32.105 INFO:tasks.workunit.client.0.smithi057.stderr:+ old_first_pinned=1 2020-10-12T11:01:32.105 INFO:tasks.workunit.client.0.smithi057.stderr:+ old_last_pinned=921 2020-10-12T11:01:32.105 INFO:tasks.workunit.client.0.smithi057.stderr:++ jq .osdmap_manifest.first_pinned 2020-10-12T11:01:32.105 INFO:tasks.workunit.client.0.smithi057.stderr:+ first_pinned=1 2020-10-12T11:01:32.105 INFO:tasks.workunit.client.0.smithi057.stderr:++ jq .osdmap_manifest.last_pinned 2020-10-12T11:01:32.105 INFO:tasks.workunit.client.0.smithi057.stderr:+ last_pinned=931 2020-10-12T11:01:32.105 INFO:tasks.workunit.client.0.smithi057.stderr:+ [[ 1 -eq 11 ]] 2020-10-12T11:01:32.106 INFO:tasks.workunit.client.0.smithi057.stderr:+ return 1 2020-10-12T11:01:32.106 INFO:tasks.workunit.client.0.smithi057.stderr:+ exit 1
old_first_pinned=$first_pinned old_last_pinned=$last_pinned first_pinned=$(jq '.osdmap_manifest.first_pinned' <<< $report) last_pinned=$(jq '.osdmap_manifest.last_pinned' <<< $report) [[ $first_pinned -eq $trim_to ]] || return 1
/a/teuthology-2020-10-12_07:01:02-rados-master-distro-basic-smithi/5519390
Updated by Neha Ojha over 5 years ago
Did not reproduce here: https://pulpito.ceph.com/nojha-2020-10-12_19:42:27-rados:monthrash-master-distro-basic-smithi/
Trying with more jobs: https://pulpito.ceph.com/nojha-2020-10-12_21:25:34-rados:monthrash-master-distro-basic-smithi/
Updated by Neha Ojha over 5 years ago
ok, fails 1 out 10 times but seems new, need to look more.
2020-10-12T23:55:24.502 INFO:tasks.workunit.client.0.smithi003.stderr:++ jq .osdmap_first_committed 2020-10-12T23:55:24.502 INFO:tasks.workunit.client.0.smithi003.stderr:+ fc=20 2020-10-12T23:55:24.502 INFO:tasks.workunit.client.0.smithi003.stderr:+ [[ 20 -eq 20 ]] 2020-10-12T23:55:24.502 INFO:tasks.workunit.client.0.smithi003.stderr:+ old_first_pinned=11 2020-10-12T23:55:24.502 INFO:tasks.workunit.client.0.smithi003.stderr:+ old_last_pinned=921 2020-10-12T23:55:24.502 INFO:tasks.workunit.client.0.smithi003.stderr:++ jq .osdmap_manifest.first_pinned 2020-10-12T23:55:24.504 INFO:tasks.workunit.client.0.smithi003.stderr:+ first_pinned=11 2020-10-12T23:55:24.504 INFO:tasks.workunit.client.0.smithi003.stderr:++ jq .osdmap_manifest.last_pinned 2020-10-12T23:55:24.504 INFO:tasks.workunit.client.0.smithi003.stderr:+ last_pinned=931 2020-10-12T23:55:24.504 INFO:tasks.workunit.client.0.smithi003.stderr:+ pinned_maps=($(jq '.osdmap_manifest.pinned_maps[]' <<< $report)) 2020-10-12T23:55:24.504 INFO:tasks.workunit.client.0.smithi003.stderr:++ jq '.osdmap_manifest.pinned_maps[]' 2020-10-12T23:55:24.505 INFO:tasks.workunit.client.0.smithi003.stderr:+ [[ 11 -eq 20 ]] 2020-10-12T23:55:24.505 INFO:tasks.workunit.client.0.smithi003.stderr:+ return 1
/a/nojha-2020-10-12_21:25:34-rados:monthrash-master-distro-basic-smithi/5520847
Updated by Neha Ojha about 5 years ago
/a/yuriw-2021-03-10_21:08:51-rados-wip-yuri8-testing-2021-03-10-0901-pacific-distro-basic-smithi/5954442 - similar
Updated by Deepika Upadhyay over 4 years ago
2021-10-19T05:03:54.608 INFO:tasks.workunit.client.0.smithi139.stderr:++ jq .osdmap_first_committed 2021-10-19T05:03:54.610 DEBUG:teuthology.orchestra.run:got remote process result: 1 2021-10-19T05:03:54.611 INFO:tasks.workunit.client.0.smithi139.stderr:+ fc=11 2021-10-19T05:03:54.611 INFO:tasks.workunit.client.0.smithi139.stderr:+ [[ 11 -eq 11 ]] 2021-10-19T05:03:54.612 INFO:tasks.workunit.client.0.smithi139.stderr:+ old_first_pinned=1 2021-10-19T05:03:54.612 INFO:tasks.workunit.client.0.smithi139.stderr:+ old_last_pinned=921 2021-10-19T05:03:54.612 INFO:tasks.workunit.client.0.smithi139.stderr:++ jq .osdmap_manifest.first_pinned 2021-10-19T05:03:54.612 INFO:tasks.workunit.client.0.smithi139.stderr:+ first_pinned=1 2021-10-19T05:03:54.613 INFO:tasks.workunit.client.0.smithi139.stderr:++ jq .osdmap_manifest.last_pinned 2021-10-19T05:03:54.613 INFO:tasks.workunit.client.0.smithi139.stderr:+ last_pinned=921 2021-10-19T05:03:54.613 INFO:tasks.workunit.client.0.smithi139.stderr:+ [[ 1 -eq 11 ]] 2021-10-19T05:03:54.613 INFO:tasks.workunit.client.0.smithi139.stderr:+ return 1 2021-10-19T05:03:54.614 INFO:tasks.workunit.client.0.smithi139.stderr:+ exit 1 2021-10-19T05:03:54.614 INFO:tasks.workunit:Stopping ['mon/test_mon_osdmap_prune.sh'] on client.0...
/ceph/teuthology-archive/yuriw-2021-10-18_19:03:43-rados-wip-yuri5-testing-2021-10-18-0906-octopus-distro-basic-smithi/6449226/teuthology.log
Updated by Neha Ojha about 4 years ago
- Related to Bug #54369: mon/test_mon_osdmap_prune.sh: jq .osdmap_first_committed [[ 11 -eq 20 ]] added
Updated by Laura Flores about 4 years ago
/a/yuriw-2022-03-01_17:45:51-rados-wip-yuri3-testing-2022-02-28-0757-pacific-distro-default-smithi/6714654
Updated by Laura Flores almost 4 years ago
/a/yuriw-2022-04-01_17:44:32-rados-wip-yuri3-testing-2022-04-01-0659-distro-default-smithi/6772697
Updated by Neha Ojha almost 4 years ago
- Assignee set to Aishwarya Mathuria
Aishwarya, can you please take a look at this bug? could be a test issue, but we should find out
Updated by Aishwarya Mathuria almost 4 years ago
Neha Ojha wrote:
Aishwarya, can you please take a look at this bug? could be a test issue, but we should find out
Sure!
Updated by Laura Flores about 3 years ago
- Tags set to test-failure
/a/lflores-2023-02-08_20:25:06-rados-wip-lflores-testing-2023-02-06-1529-distro-default-smithi/7162262
2023-02-09T08:18:28.668 INFO:tasks.workunit.client.0.smithi119.stderr:+ fc=20
2023-02-09T08:18:28.669 INFO:tasks.workunit.client.0.smithi119.stderr:+ [[ 20 -eq 20 ]]
2023-02-09T08:18:28.669 INFO:tasks.workunit.client.0.smithi119.stderr:+ old_first_pinned=11
2023-02-09T08:18:28.669 INFO:tasks.workunit.client.0.smithi119.stderr:+ old_last_pinned=921
2023-02-09T08:18:28.669 INFO:tasks.workunit.client.0.smithi119.stderr:++ jq .osdmap_manifest.first_pinned
2023-02-09T08:18:28.669 INFO:tasks.workunit.client.0.smithi119.stderr:+ first_pinned=11
2023-02-09T08:18:28.669 INFO:tasks.workunit.client.0.smithi119.stderr:++ jq .osdmap_manifest.last_pinned
2023-02-09T08:18:28.669 INFO:tasks.workunit.client.0.smithi119.stderr:+ last_pinned=931
2023-02-09T08:18:28.669 INFO:tasks.workunit.client.0.smithi119.stderr:+ pinned_maps=($(jq '.osdmap_manifest.pinned_maps[]' <<< $report))
2023-02-09T08:18:28.669 INFO:tasks.workunit.client.0.smithi119.stderr:++ jq '.osdmap_manifest.pinned_maps[]'
2023-02-09T08:18:28.669 INFO:tasks.workunit.client.0.smithi119.stderr:+ [[ 11 -eq 20 ]]
2023-02-09T08:18:28.669 INFO:tasks.workunit.client.0.smithi119.stderr:+ return 1
2023-02-09T08:18:28.669 INFO:tasks.workunit.client.0.smithi119.stderr:+ exit 1
2023-02-09T08:18:28.669 INFO:tasks.workunit:Stopping ['mon/test_mon_osdmap_prune.sh'] on client.0...
Updated by Aishwarya Mathuria about 3 years ago
From the monitor logs during OSD map trim:
2023-02-09T08:18:47.488+0000 7fae6951b700 10 mon.c@4(leader).osd e1032 min_last_epoch_clean 1030
2023-02-09T08:18:47.488+0000 7fae6951b700 10 mon.c@4(leader).osd e1032 get_trim_to explicit mon_osd_force_trim_to = 20
2023-02-09T08:18:47.488+0000 7fae6951b700 10 mon.c@4(leader).osd e1032 get_trim_to trim_to = 20
2023-02-09T08:18:47.488+0000 7fae6951b700 20 mon.c@4(leader).paxosservice(osdmap 11..1032) maybe_trim 11~20
2023-02-09T08:18:47.488+0000 7fae6951b700 10 mon.c@4(leader).paxosservice(osdmap 11..1032) maybe_trim trimming to 20, 9 states
2023-02-09T08:18:47.488+0000 7fae6951b700 10 mon.c@4(leader).paxosservice(osdmap 11..1032) trim from 11 to 20
2023-02-09T08:18:47.488+0000 7fae6951b700 20 mon.c@4(leader).paxosservice(osdmap 11..1032) trim 11
2023-02-09T08:18:47.488+0000 7fae6951b700 20 mon.c@4(leader).paxosservice(osdmap 11..1032) trim full_11
2023-02-09T08:18:47.488+0000 7fae6951b700 20 mon.c@4(leader).paxosservice(osdmap 11..1032) trim 12
2023-02-09T08:18:47.488+0000 7fae6951b700 20 mon.c@4(leader).paxosservice(osdmap 11..1032) trim 13
2023-02-09T08:18:47.488+0000 7fae6951b700 20 mon.c@4(leader).paxosservice(osdmap 11..1032) trim 14
2023-02-09T08:18:47.488+0000 7fae6951b700 20 mon.c@4(leader).paxosservice(osdmap 11..1032) trim 15
2023-02-09T08:18:47.488+0000 7fae6951b700 20 mon.c@4(leader).paxosservice(osdmap 11..1032) trim 16
2023-02-09T08:18:47.488+0000 7fae6951b700 20 mon.c@4(leader).paxosservice(osdmap 11..1032) trim 17
2023-02-09T08:18:47.488+0000 7fae6951b700 20 mon.c@4(leader).paxosservice(osdmap 11..1032) trim 18
2023-02-09T08:18:47.488+0000 7fae6951b700 20 mon.c@4(leader).paxosservice(osdmap 11..1032) trim 19
2023-02-09T08:18:47.488+0000 7fae6951b700 20 mon.c@4(leader).paxosservice(osdmap 11..1032) compacting prefix osdmap
2023-02-09T08:18:47.488+0000 7fae6951b700 10 mon.c@4(leader).osd e1032 encode_trim_extra including full map for e 20
2023-02-09T08:18:47.488+0000 7fae6951b700 10 mon.c@4(leader).osd e1032 _prune_update_trimmed first 20 last_pinned 931
The trim is happening, so now I am looking into why osdmap_manifest.first_pinned is not getting updated.
Updated by Radoslaw Zarzynski about 3 years ago
- Status changed from New to In Progress
Updated by Laura Flores about 3 years ago
/a/yuriw-2023-03-03_17:39:09-rados-reef-distro-default-smithi/7193126
2023-03-04T00:12:50.125 INFO:tasks.workunit.client.0.smithi112.stderr:+ fc=11
2023-03-04T00:12:50.125 INFO:tasks.workunit.client.0.smithi112.stderr:+ [[ 11 -eq 11 ]]
2023-03-04T00:12:50.126 INFO:tasks.workunit.client.0.smithi112.stderr:+ old_first_pinned=1
2023-03-04T00:12:50.126 INFO:tasks.workunit.client.0.smithi112.stderr:+ old_last_pinned=921
2023-03-04T00:12:50.126 INFO:tasks.workunit.client.0.smithi112.stderr:++ jq .osdmap_manifest.first_pinned
2023-03-04T00:12:50.126 INFO:tasks.workunit.client.0.smithi112.stderr:+ first_pinned=1
2023-03-04T00:12:50.126 INFO:tasks.workunit.client.0.smithi112.stderr:++ jq .osdmap_manifest.last_pinned
2023-03-04T00:12:50.126 INFO:tasks.workunit.client.0.smithi112.stderr:+ last_pinned=921
2023-03-04T00:12:50.126 INFO:tasks.workunit.client.0.smithi112.stderr:+ [[ 1 -eq 11 ]]
2023-03-04T00:12:50.126 INFO:tasks.workunit.client.0.smithi112.stderr:+ return 1
2023-03-04T00:12:50.127 INFO:tasks.workunit.client.0.smithi112.stderr:+ exit 1
2023-03-04T00:12:50.127 INFO:tasks.workunit:Stopping ['mon/test_mon_osdmap_prune.sh'] on client.0...
2023-03-04T00:12:50.127 DEBUG:teuthology.orchestra.run.smithi112:> sudo rm -rf -- /home/ubuntu/cephtest/workunits.list.client.0 /home/ubuntu/cephtest/clone.client.0
2023-03-04T00:12:50.388 ERROR:teuthology.run_tasks:Saw exception from tasks.
Traceback (most recent call last):
Updated by Laura Flores about 3 years ago
- Backport changed from pacific, octopus to pacific, octopus, quincy, reef
Updated by Laura Flores almost 3 years ago
/a/yuriw-2023-03-30_21:53:20-rados-wip-yuri7-testing-2023-03-29-1100-distro-default-smithi/7228118
Updated by Radoslaw Zarzynski almost 3 years ago
Updated by Laura Flores over 2 years ago
/a/yuriw-2023-08-17_21:18:20-rados-wip-yuri11-testing-2023-08-17-0823-distro-default-smithi/7372203
Updated by Laura Flores over 1 year ago
/a/yuriw-2024-07-17_13:32:02-rados-wip-yuri12-testing-2024-07-16-1122-distro-default-smithi/7805875
Updated by Laura Flores over 1 year ago
- Tags changed from test-failure to test-failure, main-failures
Updated by Radoslaw Zarzynski over 1 year ago
@Aishwarya Mathuria: what are the plans for fixing?
Updated by Konstantin Shalygin over 1 year ago
- Backport changed from pacific, octopus, quincy, reef to octopus, quincy, reef
Updated by Konstantin Shalygin over 1 year ago
- Backport changed from octopus, quincy, reef to quincy, reef
Updated by Aishwarya Mathuria over 1 year ago
@Radoslaw Zarzynski started working on a fix this week. Thanks for the reminder.
Updated by Aishwarya Mathuria over 1 year ago
- Related to Bug #68093: workunits/mon/test_mon_osdmap_prune.sh: first_pinned not equal to trim to value added
Updated by Aishwarya Mathuria over 1 year ago
Update:
In addition to the explanation in the gist above, I was trying to figure out if this is a test issue or an error.
In the test, we assume the trimming of osdmaps is done if osdmap_first_committed is equal to the epoch of the map upto which we had to trim.
osdmap_first_committed however returns the value of cached_first_committed which is updated before a trim actually happens as explained in the gist above.
Even if the Paxos trim proposal fails, according to the test the trim was successful as the cached_first_committed value has been changed.
Normally this is not a problem as when refresh() happens, cached_first_committed will be corrected to the value before the trim was proposed.
This scenario is explained here as well in the PR where cached_first_committed is introduced - https://github.com/ceph/ceph/pull/19397#issuecomment-356557679
I think changing the test to verify that osdmap_manifest.first_pinned has changed to the correct value is a better way to signify that the trim is complete.
I will test out this fix and update here.
Updated by Radoslaw Zarzynski over 1 year ago
- Backport changed from quincy, reef to quincy,reef,squid
Updated by Radoslaw Zarzynski over 1 year ago
@Aishwarya Mathuria: what's the status?
Updated by Laura Flores over 1 year ago
Writing from bug scrub.. @Aishwarya Mathuria feel free to bring this to the rados call if you need more eyes on it. Thanks for testing a fix!
Updated by Laura Flores over 1 year ago
Note from bug scrub: @Aishwarya Mathuria any updates on this one?
Updated by Radoslaw Zarzynski about 1 year ago
- Status changed from In Progress to Fix Under Review
Updated by Radoslaw Zarzynski about 1 year ago
scrub note: approved, awaits QA.
Updated by Laura Flores about 1 year ago
- Status changed from Fix Under Review to Pending Backport
- Backport changed from quincy,reef,squid to reef,squid
Updated by Upkeep Bot about 1 year ago
- Copied to Backport #69773: reef: mon/test_mon_osdmap_prune.sh: first_pinned != trim_to added
Updated by Upkeep Bot about 1 year ago
- Copied to Backport #69774: squid: mon/test_mon_osdmap_prune.sh: first_pinned != trim_to added
Updated by Upkeep Bot about 1 year ago
- Tags (freeform) set to backport_processed
Updated by Laura Flores 11 months ago
/a/yuriw-2025-04-15_19:42:43-rados-reef-release-distro-default-smithi/8243559
Updated by Upkeep Bot 9 months ago
- Merge Commit set to a39568e579568a733047901081451fe340f4d913
- Fixed In set to v19.3.0-7259-ga39568e5795
- Upkeep Timestamp set to 2025-07-09T18:56:32+00:00
Updated by Upkeep Bot 8 months ago
- Fixed In changed from v19.3.0-7259-ga39568e5795 to v19.3.0-7259-ga39568e579
- Upkeep Timestamp changed from 2025-07-09T18:56:32+00:00 to 2025-07-14T18:12:45+00:00
Updated by Konstantin Shalygin 7 months ago
- Status changed from Pending Backport to Resolved
Updated by Upkeep Bot 5 months ago
- Released In set to v20.2.0~1233
- Upkeep Timestamp changed from 2025-07-14T18:12:45+00:00 to 2025-11-01T01:28:57+00:00