Bug #75355


Rocky10 - smoke/basic: RuntimeError: Fuse mount failed to populate/sys/ after 31 seconds

Added by Laura Flores 15 days ago. Updated 3 days ago.

Status:
Fix Under Review
Priority:
Normal
Assignee:
-
Category:
Testing
Target version:
v21.0.0
% Done:

0%

Source:
Q/A
Backport:
tentacle
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
66294
Tags (freeform):
Merge Commit:
Fixed In:
Released In:
Upkeep Timestamp:
Tags:

Description

Failed job:

/a/yuriw-2026-03-04_22:27:22-smoke-wip-rocky10-branch-of-the-day-2026-03-04-1772633736-distro-default-trial/83978

2026-03-05T12:47:39.803 INFO:teuthology.orchestra.run.trial160.stdout:/run/netns/ceph-ns--home-ubuntu-cephtest-mnt.0
2026-03-05T12:47:39.803 INFO:teuthology.orchestra.run.trial160.stdout:/run/netns/ceph-ns--home-ubuntu-cephtest-mnt.0
2026-03-05T12:47:39.804 INFO:teuthology.orchestra.run:Running command with timeout 300
2026-03-05T12:47:39.804 DEBUG:teuthology.orchestra.run.trial160:> ls /sys/fs/fuse/connections
2026-03-05T12:47:40.952 DEBUG:teuthology.orchestra.run.trial160:> sudo modprobe fuse
2026-03-05T12:47:40.962 DEBUG:teuthology.orchestra.run.trial160:> cat /proc/self/mounts | awk '{print $2}'
...
2026-03-05T12:48:17.104 WARNING:tasks.cephfs.fuse_mount:Trying to clean up after failed mount
2026-03-05T12:48:17.105 DEBUG:teuthology.orchestra.run.trial160:> set -ex
2026-03-05T12:48:17.105 DEBUG:teuthology.orchestra.run.trial160:> dd if=/proc/self/mounts of=/dev/stdout
2026-03-05T12:48:17.109 DEBUG:tasks.cephfs.mount:not mounted; /proc/self/mounts is:
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
udev /dev devtmpfs rw,nosuid,relatime,size=65421872k,nr_inodes=16355468,mode=755,inode64 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /run tmpfs rw,nosuid,nodev,noexec,relatime,size=13096172k,mode=755,inode64 0 0
/dev/nvme0n1p2 / ext4 rw,relatime 0 0
securityfs /sys/kernel/security securityfs rw,nosuid,nodev,noexec,relatime 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev,inode64 0 0
tmpfs /run/lock tmpfs rw,nosuid,nodev,noexec,relatime,size=5120k,inode64 0 0
cgroup2 /sys/fs/cgroup cgroup2 rw,nosuid,nodev,noexec,relatime,nsdelegate,memory_recursiveprot 0 0
pstore /sys/fs/pstore pstore rw,nosuid,nodev,noexec,relatime 0 0
efivarfs /sys/firmware/efi/efivars efivarfs rw,nosuid,nodev,noexec,relatime 0 0
bpf /sys/fs/bpf bpf rw,nosuid,nodev,noexec,relatime,mode=700 0 0
systemd-1 /proc/sys/fs/binfmt_misc autofs rw,relatime,fd=30,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=17394 0 0
hugetlbfs /dev/hugepages hugetlbfs rw,relatime,pagesize=2M 0 0
mqueue /dev/mqueue mqueue rw,nosuid,nodev,noexec,relatime 0 0
debugfs /sys/kernel/debug debugfs rw,nosuid,nodev,noexec,relatime 0 0
tracefs /sys/kernel/tracing tracefs rw,nosuid,nodev,noexec,relatime 0 0
nfsd /proc/fs/nfsd nfsd rw,relatime 0 0
fusectl /sys/fs/fuse/connections fusectl rw,nosuid,nodev,noexec,relatime 0 0
configfs /sys/kernel/config configfs rw,nosuid,nodev,noexec,relatime 0 0
none /run/credentials/systemd-sysusers.service ramfs ro,nosuid,nodev,noexec,relatime,mode=700 0 0
/dev/loop0 /snap/core20/2686 squashfs ro,nodev,relatime,errors=continue 0 0
/dev/loop1 /snap/lxd/36918 squashfs ro,nodev,relatime,errors=continue 0 0
/dev/loop2 /snap/snapd/25935 squashfs ro,nodev,relatime,errors=continue 0 0
/dev/nvme0n1p1 /boot/efi vfat rw,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=iso8859-1,shortname=mixed,errors=remount-ro 0 0
binfmt_misc /proc/sys/fs/binfmt_misc binfmt_misc rw,nosuid,nodev,noexec,relatime 0 0
sunrpc /run/rpc_pipefs rpc_pipefs rw,relatime 0 0
tmpfs /run/snapd/ns tmpfs rw,nosuid,nodev,noexec,relatime,size=13096172k,mode=755,inode64 0 0
nsfs /run/snapd/ns/lxd.mnt nsfs rw 0 0
tmpfs /run/user/1000 tmpfs rw,nosuid,nodev,relatime,size=13096168k,nr_inodes=3274042,mode=700,uid=1000,gid=1002,inode64 0 0
/dev/mapper/vg_nvme-lv_5 /var/lib/ceph xfs rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota 0 0
tmpfs /run/netns tmpfs rw,nosuid,nodev,noexec,relatime,size=13096172k,mode=755,inode64 0 0
nsfs /run/netns/ceph-ns--home-ubuntu-cephtest-mnt.0 nsfs rw 0 0
nsfs /run/netns/ceph-ns--home-ubuntu-cephtest-mnt.0 nsfs rw 0 0

2026-03-05T12:48:17.109 DEBUG:tasks.cephfs.fuse_mount:ceph-fuse client.0 is not mounted at ubuntu@trial160.front.sepia.ceph.com /home/ubuntu/cephtest/mnt.0
2026-03-05T12:48:17.109 INFO:tasks.cephfs.mount:Cleaning up mount ubuntu@trial160.front.sepia.ceph.com
2026-03-05T12:48:17.109 INFO:teuthology.orchestra.run:Running command with timeout 300
2026-03-05T12:48:17.110 DEBUG:teuthology.orchestra.run.trial160:> (cd /home/ubuntu/cephtest && exec rmdir -- /home/ubuntu/cephtest/mnt.0)
2026-03-05T12:48:17.157 ERROR:teuthology.run_tasks:Saw exception from tasks.
Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_teuthology_6e3d0419ccca0e75a03e08feb708946e198236f4/teuthology/run_tasks.py", line 112, in run_tasks
    manager.__enter__() 
  File "/usr/lib/python3.12/contextlib.py", line 137, in __enter__
    return next(self.gen)
           ^^^^^^^^^^^^^^
  File "/home/teuthworker/src/github.com_ceph_ceph-c_ed3b19d231df1ddad45c9e95b328a8c12776aa5d/qa/tasks/ceph_fuse.py", line 234, in task
    mount_x.mount(mntopts=config.get('mntopts', []), mntargs=mntargs)
  File "/home/teuthworker/src/github.com_ceph_ceph-c_ed3b19d231df1ddad45c9e95b328a8c12776aa5d/qa/tasks/cephfs/fuse_mount.py", line 52, in mount
    return self._mount(mntopts, mntargs, check_status)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/teuthworker/src/github.com_ceph_ceph-c_ed3b19d231df1ddad45c9e95b328a8c12776aa5d/qa/tasks/cephfs/fuse_mount.py", line 68, in _mount
    retval = self._run_mount_cmd(mntopts, mntargs, check_status)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/teuthworker/src/github.com_ceph_ceph-c_ed3b19d231df1ddad45c9e95b328a8c12776aa5d/qa/tasks/cephfs/fuse_mount.py", line 94, in _run_mount_cmd
    return self._wait_and_record_our_fuse_conn(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/teuthworker/src/github.com_ceph_ceph-c_ed3b19d231df1ddad45c9e95b328a8c12776aa5d/qa/tasks/cephfs/fuse_mount.py", line 189, in _wait_and_record_our_fuse_conn
    raise RuntimeError( 
RuntimeError: Fuse mount failed to populate/sys/ after 31 seconds 
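For context, the failing helper (`_wait_and_record_our_fuse_conn` in `qa/tasks/cephfs/fuse_mount.py`, per the traceback above) polls `/sys/fs/fuse/connections` until a connection entry that was not present before the mount appears, and raises once the timeout is exceeded. A minimal sketch of that polling pattern (illustrative only, not the actual qa code; the injectable `list_conns`, `clock`, and `sleep` parameters are assumptions made so the sketch is self-contained and testable):

```python
import time


def wait_for_new_fuse_conn(list_conns, pre_existing, timeout=30, interval=1,
                           clock=time.monotonic, sleep=time.sleep):
    """Wait for a fuse connection entry that wasn't there before mounting.

    list_conns: callable returning the current entries under
    /sys/fs/fuse/connections (injected here for testability; the real
    harness runs `ls /sys/fs/fuse/connections` on the remote node).
    pre_existing: entries recorded before the mount was attempted.
    """
    deadline = clock() + timeout
    while clock() < deadline:
        # Any entry not seen before the mount belongs to our new connection.
        new = set(list_conns()) - set(pre_existing)
        if new:
            return new.pop()
        sleep(interval)
    raise RuntimeError(
        "Fuse mount failed to populate /sys/ after %d seconds" % timeout)
```

With a hung or unreachable client (as in the MDS-connection stall discussed in comment #6 below), no new entry ever appears and the loop ends in the RuntimeError seen in the traceback.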

Several more failed in the same run:
3 jobs: ['83978', '83979', '83982']

Passed job (exact same job description):

zack-2026-02-04_16:34:20-smoke-main-distro-default-trial/35060

2026-02-04T19:12:05.560 INFO:teuthology.orchestra.run.trial095.stdout:/run/netns/ceph-ns--home-ubuntu-cephtest-mnt.0
2026-02-04T19:12:05.560 INFO:teuthology.orchestra.run.trial095.stdout:/run/netns/ceph-ns--home-ubuntu-cephtest-mnt.0
2026-02-04T19:12:05.561 INFO:teuthology.orchestra.run:Running command with timeout 300
2026-02-04T19:12:05.561 DEBUG:teuthology.orchestra.run.trial095:> ls /sys/fs/fuse/connections
2026-02-04T19:12:05.575 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.trial095.stderr:2026-02-04T19:12:05.573+0000 7fb77d837680 -1 init, newargv = 0x557483dd1ab0 newargc=15
2026-02-04T19:12:05.575 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.trial095.stderr:ceph-fuse[11681]: starting ceph client
2026-02-04T19:12:05.580 INFO:tasks.cephfs.fuse_mount.ceph-fuse.0.trial095.stderr:ceph-fuse[11681]: starting fuse
2026-02-04T19:12:06.680 DEBUG:teuthology.orchestra.run.trial095:> sudo modprobe fuse
2026-02-04T19:12:06.689 DEBUG:teuthology.orchestra.run.trial095:> cat /proc/self/mounts | awk '{print $2}'

I'm almost certain this is not related to Rocky10 or the associated Python changes, but since this issue was not previously tracked anywhere and I can't find it on any other main baselines, we should rule out any relation.


Related issues (8 total: 2 open, 6 closed)

Related to Ceph QA - QA Run #75339: wip-rocky10-branch-of-the-day-2026-03-04-1772633736 (QA Testing, Yaarit Hatuka)
Related to CephFS - Bug #75376: qa: Fuse mount failed to populate/sys/ after 31 seconds (Duplicate)
Related to CephFS - Bug #75357: Rocky10 - smoke/basic: CephFS kernel mount fails (Duplicate)
Related to Ceph QA - QA Run #75543: wip-rocky10-branch-of-the-day-2026-03-16-1773712510 (QA Testing)
Has duplicate CephFS - Bug #75394: fs/misc appears to hang and timeout on rocky 10.1 (Duplicate)
Has duplicate CephFS - Bug #75392: fsstress.sh appears to hang and time out on rocky 10.1 (Duplicate)
Has duplicate CephFS - Bug #75393: fsync-tester.sh appears to hang and timeout on rocky 10.1 (Duplicate)
Has duplicate CephFS - Bug #75357: Rocky10 - smoke/basic: CephFS kernel mount fails (Duplicate)
Actions #1

Updated by Laura Flores 15 days ago

  • Related to QA Run #75339: wip-rocky10-branch-of-the-day-2026-03-04-1772633736 added
Actions #2

Updated by Venky Shankar 14 days ago

  • Related to Bug #75376: qa: Fuse mount failed to populate/sys/ after 31 seconds added
Actions #3

Updated by Venky Shankar 14 days ago

Laura Flores wrote:

I'm almost certain this is not related to Rocky10 or the associated Python changes, but since this issue was not previously tracked anywhere and I can't find it on any other main baselines, we should rule out any relation.

And this happens on an Ubuntu host, so it's definitely not Rocky10-specific. But as I noted in the (dup) tracker I had opened: https://tracker.ceph.com/issues/75376

"The test is on Ubuntu host however, so it might just be python related. The reason for the failure seems to be related to fuse connections not getting populated in /sys/fs/fuse/connections."

Actions #4

Updated by Laura Flores 14 days ago

/a/yuriw-2026-03-05_23:35:09-smoke-wip-rocky10-branch-of-the-day-2026-03-04-1772633736-distro-default-trial
3 jobs: ['90044', '90043', '90047']

Actions #5

Updated by Nitzan Mordechai 12 days ago

/a/yuriw-2026-03-07_15:37:31-smoke-wip-rocky10-branch-of-the-day-2026-03-06-1772840606-distro-default-trial/
6 jobs: ['92908', '92914', '92916', '92915', '92919', '92917']

Actions #6

Updated by Venky Shankar 11 days ago

For failed job: /a/yuriw-2026-03-05_16:00:50-fs-wip-rocky10-branch-of-the-day-2026-03-04-1772633736-distro-default-trial/89336

2026-03-05T21:13:50.747+0000 7fa056799680  1 --2- 192.168.144.1:0/1038009080 >> [v2:10.20.193.190:6836/2122419449,v1:10.20.193.190:6837/2122419449] conn(0x562ccd68baa0 0x562ccd6abf00 unknown :-1 s=NONE pgs=0 gs=0 cs=0 l=0 c_cookie=0 s_cookie=0 reconnecting=0 rev1=0 crypto rx=0 tx=0 comp rx=0 tx=0).connect
2026-03-05T21:13:50.747+0000 7fa056799680  1 -- 192.168.144.1:0/1038009080 --> [v2:10.20.193.190:6836/2122419449,v1:10.20.193.190:6837/2122419449] -- client_session(request_open) -- 0x562ccd6ac5d0 con 0x562ccd68baa0
2026-03-05T21:13:50.747+0000 7fa056799680 10 client.4803 waiting for session to mds.0 to open
2026-03-05T21:13:51.747+0000 7fa035ffb640 20 client.4803 tick
2026-03-05T21:13:51.747+0000 7fa035ffb640 10 client.4803 renew_caps()
2026-03-05T21:13:51.747+0000 7fa035ffb640 15 client.4803 renew_caps requesting from mds.0
2026-03-05T21:13:51.747+0000 7fa035ffb640 10 client.4803 renew_caps mds.0
2026-03-05T21:13:51.747+0000 7fa035ffb640  1 -- 192.168.144.1:0/1038009080 --> [v2:10.20.193.190:6836/2122419449,v1:10.20.193.190:6837/2122419449] -- client_session(request_renewcaps seq 1) -- 0x7fa024001d90 con 0x562ccd68baa0

The connection to the MDS doesn't seem to have gone through:

2026-03-05T21:14:20.771+0000 7fa0377fe640  1 -- 192.168.144.1:0/1038009080 <== mon.0 v2:10.20.193.167:3300/0 7 ==== osd_map(22..22 src has 1..22) ==== 312+0+0 (secure 0 0 0) 0x7fa04001b440 con 0x7fa03c023cd0
2026-03-05T21:14:21.379+0000 7fa0477fe640  1 -- 192.168.144.1:0/1038009080 >> [v2:10.20.193.190:6836/2122419449,v1:10.20.193.190:6837/2122419449] conn(0x562ccd68baa0 msgr2=0x562ccd6abf00 unknown :-1 s=STATE_CONNECTING_RE l=0).tick see no progress in more than 10000000 us during connecting to v2:10.20.193.190:6836/2122419449, fault.
2026-03-05T21:14:21.379+0000 7fa0477fe640  1 --2- 192.168.144.1:0/1038009080 >> [v2:10.20.193.190:6836/2122419449,v1:10.20.193.190:6837/2122419449] conn(0x562ccd68baa0 0x562ccd6abf00 unknown :-1 s=START_CONNECT pgs=0 gs=4 cs=0 l=0 c_cookie=0 s_cookie=0 reconnecting=0 rev1=0 crypto rx=0 tx=0 comp rx=0 tx=0)._fault waiting 0.800000
2026-03-05T21:14:21.755+0000 7fa035ffb640 20 client.4803 tick
2026-03-05T21:14:21.755+0000 7fa035ffb640 20 client.4803 collect_and_send_metrics
2026-03-05T21:14:21.755+0000 7fa035ffb640 20 client.4803 collect_and_send_global_metrics
2026-03-05T21:14:21.755+0000 7fa035ffb640  5 client.4803 collect_and_send_global_metrics: no session with rank=0 -- not sending metric
2026-03-05T21:14:21.755+0000 7fa035ffb640 20 client.4803 trim_cache size 0 max 16384
2026-03-05T21:14:21.755+0000 7fa035ffb640 20 client.4803 upkeep thread waiting interval 1.000000000s

The MDS logs do not have any traces for the client id:

vshankar@trial200:/a/yuriw-2026-03-05_16:00:50-fs-wip-rocky10-branch-of-the-day-2026-03-04-1772633736-distro-default-trial/89336$ find . -name "ceph-mds*" | xargs zgrep "client.4803" 
vshankar@trial200:/a/yuriw-2026-03-05_16:00:50-fs-wip-rocky10-branch-of-the-day-2026-03-04-1772633736-distro-default-trial/89336$
Actions #7

Updated by Venky Shankar 11 days ago

See discussion https://github.com/ceph/ceph/pull/66294/changes#r2903653313 for a possible explanation.

Actions #8

Updated by Venky Shankar 11 days ago

  • Related to Bug #75357: Rocky10 - smoke/basic: CephFS kernel mount fails added
Actions #9

Updated by Nitzan Mordechai 10 days ago

/a/yuriw-2026-03-09_20:51:09-smoke-wip-rocky10-branch-of-the-day-2026-03-09-1773078259-distro-default-trial/
6 jobs: ['95314', '95321', '95325', '95303', '95306', '95323']

Actions #10

Updated by Nitzan Mordechai 10 days ago

/a/yuriw-2026-03-09_21:03:30-smoke-wip-rocky10-branch-of-the-day-2026-03-09-1773079353-tentacle-distro-default-trial/
7 jobs: ['96223', '96191', '96229', '96197', '96207', '96195', '96225']

Actions #11

Updated by Brad Hubbard 10 days ago

  • Has duplicate Bug #75394: fs/misc appears to hang and timeout on rocky 10.1 added
Actions #12

Updated by Brad Hubbard 10 days ago

  • Has duplicate Bug #75392: fsstress.sh appears to hang and time out on rocky 10.1 added
Actions #13

Updated by Brad Hubbard 10 days ago

  • Has duplicate Bug #75393: fsync-tester.sh appears to hang and timeout on rocky 10.1 added
Actions #14

Updated by Venky Shankar 4 days ago

  • Category set to Testing
  • Status changed from New to Fix Under Review
  • Target version set to v21.0.0
  • Source set to Q/A
  • Backport set to tentacle
  • Pull request ID set to 66294
Actions #15

Updated by Venky Shankar 4 days ago

  • Has duplicate Bug #75357: Rocky10 - smoke/basic: CephFS kernel mount fails added
Actions #16

Updated by Nitzan Mordechai 3 days ago

/a/nmordech-2026-03-17_06:15:36-rados-wip-rocky10-branch-of-the-day-2026-03-16-1773712510-distro-default-trial/
2 jobs: ['103735', '103669']

Actions #17

Updated by Laura Flores 3 days ago

  • Related to QA Run #75543: wip-rocky10-branch-of-the-day-2026-03-16-1773712510 added