Bug #74574

open

Error reimaging machines: reached maximum tries (120) after waiting for 1800 seconds

Added by Laura Flores about 2 months ago. Updated about 1 month ago.

Status: In Progress
Priority: Normal
% Done: 0%
Source:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Tags (freeform):

Description

There are 19 dead jobs in this run that failed because of this issue:

https://pulpito.ceph.com/yuriw-2026-01-22_22:40:58-rados-wip-yuri12-testing-2026-01-22-2045-distro-default-trial/

Failure Reason:
Error reimaging machines: reached maximum tries (120) after waiting for 1800 seconds
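For readers unfamiliar with this kind of failure message, here is a minimal sketch of a bounded polling loop that would produce the same wording. It is not teuthology's implementation: `wait_until`, `MaxTriesExceeded`, and the `sleep_fn` parameter are hypothetical names, and the 15-second interval is only an inference from 1800 / 120, not something the failure reason states.

```python
import time


class MaxTriesExceeded(Exception):
    """Hypothetical stand-in for the error named in the failure reason."""


def wait_until(check, tries=120, sleep=15, sleep_fn=time.sleep):
    """Poll check() up to `tries` times, pausing `sleep` seconds after each miss.

    Returns the number of seconds waited once check() succeeds.
    The 15-second default is an assumption derived from 1800 / 120.
    """
    waited = 0
    for _ in range(tries):
        if check():
            return waited
        sleep_fn(sleep)  # injectable so the loop can be exercised without real delays
        waited += sleep
    raise MaxTriesExceeded(
        f"reached maximum tries ({tries}) after waiting for {waited} seconds"
    )
```

With a check that never succeeds (for example, a machine that never finishes reimaging), the loop gives up after 120 attempts and reports the total time it spent sleeping.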

Actions #1

Updated by Laura Flores about 2 months ago

  • Description updated
Actions #3

Updated by David Galloway about 2 months ago

  • Status changed from New to In Progress
  • Assignee set to David Galloway

There are a couple dozen systems that have failed to reimage 10 times. Each will need to be looked at.

Actions #5

Updated by Nitzan Mordechai about 1 month ago · Edited

https://pulpito.ceph.com/yaarit-2026-02-10_23:44:29-rados:cephadm-wip-rocky10-branch-of-the-day-2026-02-09-1770676549-distro-default-trial/44142/

2026-02-11T00:10:22.596 ERROR:teuthology.contextutil:Saw exception from nested tasks
Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_teuthology_c433f1062990a0488dc29a553589bc609a460691/teuthology/contextutil.py", line 30, in nested
    vars.append(enter())
                ^^^^^^^
  File "/usr/lib/python3.12/contextlib.py", line 137, in __enter__
    return next(self.gen)
           ^^^^^^^^^^^^^^
  File "/home/teuthworker/src/git.ceph.com_ceph-c_adddf0ecd2d88c7de83a50c9f262beccb2f8584f/qa/tasks/cephadm.py", line 1120, in ceph_osds
    while proceed():
          ^^^^^^^^^
  File "/home/teuthworker/src/git.ceph.com_teuthology_c433f1062990a0488dc29a553589bc609a460691/teuthology/contextutil.py", line 134, in __call__
    raise MaxWhileTries(error_msg)
teuthology.exceptions.MaxWhileTries: reached maximum tries (120) after waiting for 120 seconds
Actions #6

Updated by David Galloway about 1 month ago

yaarit-2026-02-10_23:44:29-rados:cephadm-wip-rocky10-branch-of-the-day-2026-02-09-1770676549-distro-default-trial/44142/ is a different issue. That failure is in the cephadm task, not infra.
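One quick way to see that the two messages differ is to compare the tries and the wait time they report. The sketch below is illustrative only: `seconds_per_try` and `PATTERN` are hypothetical names, and a different interval is merely a hint that the errors come from different loops. The traceback, which points at `ceph_osds` in `qa/tasks/cephadm.py`, is the actual evidence.

```python
import re

# Matches the wording shared by both messages quoted in this issue.
PATTERN = re.compile(
    r"reached maximum tries \((\d+)\) after waiting for (\d+) seconds"
)


def seconds_per_try(message):
    """Return the average wait per try reported in the message, or None if no match."""
    match = PATTERN.search(message)
    if match is None:
        return None
    tries, waited = int(match.group(1)), int(match.group(2))
    return waited / tries


reimage = "Error reimaging machines: reached maximum tries (120) after waiting for 1800 seconds"
cephadm = "teuthology.exceptions.MaxWhileTries: reached maximum tries (120) after waiting for 120 seconds"

print(seconds_per_try(reimage))  # 15.0
print(seconds_per_try(cephadm))  # 1.0
```

Both messages report 120 tries, but the reimage failure waited 1800 seconds in total while the cephadm failure waited 120 seconds.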
