Bug #53859
openqa: Test failure: test_pool_perm (tasks.cephfs.test_pool_perm.TestPoolPerm)
0%
Description
Started seeing this lately - https://pulpito.ceph.com/vshankar-2022-01-13_04:42:40-fs-wip-vshankar-testing-20220106-145819-testing-default-smithi/6612000/
Also happens in master - https://pulpito.ceph.com/vshankar-2022-01-13_11:30:06-fs-master-testing-default-smithi/6612583/
Updated by Patrick Donnelly about 4 years ago
Relevant bits from the log:
2022-01-13T05:23:31.208+0000 7f32cffff700 3 client.5688 ll_open 0x10000000000.head 2 = 0 (0x7f32c8002d00)
2022-01-13T05:23:31.209+0000 7f32cffff700 8 client.5688 _ll_forget 0x10000000000 1
2022-01-13T05:23:31.209+0000 7f32cffff700 20 client.5688 _ll_put 0x7f32e0009670 0x10000000000 1 -> 1
2022-01-13T05:23:31.209+0000 7f32e77fe700 20 client.5688 put_inode on 0x10000000000.head(faked_ino=0 nref=13 ll_ref=1 cap_refs={} open={3=1} mode=100664 size=0/0 nlink=1 btime=2022-01-13T05:23:18.640986+0000 mtime=2022-01-13T05:23:18.645267+0000 ctime=2022-01-13T05:23:18.646363+0000 caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) objectset[0x10000000000 ts 0/0 objects 0 dirty_or_tx 0] parents=0x1.head["test_file"] 0x7f32e0009670) n = 1
2022-01-13T05:23:31.209+0000 7f32e77fe700 20 client.5688 put_inode on 0x10000000000.head(faked_ino=0 nref=13 ll_ref=1 cap_refs={} open={3=1} mode=100664 size=0/0 nlink=1 btime=2022-01-13T05:23:18.640986+0000 mtime=2022-01-13T05:23:18.645267+0000 ctime=2022-01-13T05:23:18.646363+0000 caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) objectset[0x10000000000 ts 0/0 objects 0 dirty_or_tx 0] parents=0x1.head["test_file"] 0x7f32e0009670) n = 1
2022-01-13T05:23:31.209+0000 7f32e4ff9700 20 client.5688 _ll_get 0x7f32e0009670 0x10000000000 -> 2
2022-01-13T05:23:31.209+0000 7f32e4ff9700 8 client.5688 _ll_getattr 0x10000000000.head
2022-01-13T05:23:31.209+0000 7f32e4ff9700 10 client.5688 _getattr mask pAsLsXsFs issued=1
2022-01-13T05:23:31.209+0000 7f32e4ff9700 10 client.5688 fill_stat on 0x10000000000 snap/devhead mode 0100664 mtime 2022-01-13T05:23:18.645267+0000 ctime 2022-01-13T05:23:18.646363+0000
2022-01-13T05:23:31.209+0000 7f32e4ff9700 3 client.5688 ll_getattr 0x10000000000.head = 0
2022-01-13T05:23:31.209+0000 7f32e4ff9700 8 client.5688 _ll_forget 0x10000000000 1
2022-01-13T05:23:31.209+0000 7f32e4ff9700 20 client.5688 _ll_put 0x7f32e0009670 0x10000000000 1 -> 1
2022-01-13T05:23:31.212+0000 7f32e77fe700 1 -- 192.168.0.1:0/2718057660 <== mds.0 v2:172.21.15.153:6835/2865775467 6 ==== client_caps(grant ino 0x10000000000 5 seq 3 caps=pAsxLsXsxFsxcrwb dirty=- wanted=pAsxXsxFsxcrwb follows 0 size 0/4194304 ts 1/18446744073709551615 mtime 2022-01-13T05:23:18.645267+0000 tws 2) v11 ==== 252+0+0 (crc 0 0 0) 0x7f32f400e040 con 0x560ab8f738a0
2022-01-13T05:23:31.212+0000 7f32e77fe700 5 client.5688 set_cap_epoch_barrier epoch = 57
2022-01-13T05:23:31.212+0000 7f32e77fe700 10 client.5688 mds.0 seq now 1
2022-01-13T05:23:31.212+0000 7f32e77fe700 5 client.5688 handle_cap_grant on in 0x10000000000 mds.0 seq 3 caps now pAsxLsXsxFsxcrwb was pAsxLsXsxFsxcrwb
2022-01-13T05:23:31.212+0000 7f32e77fe700 10 client.5688 update_inode_file_time 0x10000000000.head(faked_ino=0 nref=13 ll_ref=1 cap_refs={} open={3=1} mode=100664 size=0/0 nlink=1 btime=2022-01-13T05:23:18.640986+0000 mtime=2022-01-13T05:23:18.645267+0000 ctime=2022-01-13T05:23:18.646363+0000 caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) objectset[0x10000000000 ts 0/0 objects 0 dirty_or_tx 0] parents=0x1.head["test_file"] 0x7f32e0009670) pAsxLsXsxFsxcrwb ctime 2022-01-13T05:23:18.646363+0000 mtime 2022-01-13T05:23:18.645267+0000
2022-01-13T05:23:31.212+0000 7f32e77fe700 10 client.5688 max_size 0 -> 4194304
2022-01-13T05:23:31.212+0000 7f32e77fe700 10 client.5688 caps unchanged at pAsxLsXsxFsxcrwb
2022-01-13T05:23:31.213+0000 7f32cffff700 3 client.5688 ll_flush 0x7f32c8002d00 0x10000000000
2022-01-13T05:23:31.213+0000 7f32cffff700 10 client.5688 _flush: 0x7f32c8002d00 on inode 0x10000000000.head(faked_ino=0 nref=13 ll_ref=1 cap_refs={} open={3=1} mode=100664 size=0/4194304 nlink=1 btime=2022-01-13T05:23:18.640986+0000 mtime=2022-01-13T05:23:18.645267+0000 ctime=2022-01-13T05:23:18.646363+0000 caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) objectset[0x10000000000 ts 0/0 objects 0 dirty_or_tx 0] parents=0x1.head["test_file"] 0x7f32e0009670) no async_err state
2022-01-13T05:23:31.213+0000 7f32e4ff9700 3 client.5688 ll_release (fh)0x7f32c8002d00 0x10000000000
2022-01-13T05:23:31.213+0000 7f32e4ff9700 8 client.5688 _release_fh 0x7f32c8002d00 mode 3 on 0x10000000000.head(faked_ino=0 nref=13 ll_ref=1 cap_refs={} open={3=1} mode=100664 size=0/4194304 nlink=1 btime=2022-01-13T05:23:18.640986+0000 mtime=2022-01-13T05:23:18.645267+0000 ctime=2022-01-13T05:23:18.646363+0000 caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) objectset[0x10000000000 ts 0/0 objects 0 dirty_or_tx 0] parents=0x1.head["test_file"] 0x7f32e0009670)
2022-01-13T05:23:31.213+0000 7f32e4ff9700 10 client.5688 _flush 0x10000000000.head(faked_ino=0 nref=14 ll_ref=1 cap_refs={} open={3=0} mode=100664 size=0/4194304 nlink=1 btime=2022-01-13T05:23:18.640986+0000 mtime=2022-01-13T05:23:18.645267+0000 ctime=2022-01-13T05:23:18.646363+0000 caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) objectset[0x10000000000 ts 0/0 objects 0 dirty_or_tx 0] parents=0x1.head["test_file"] 0x7f32e0009670)
2022-01-13T05:23:31.213+0000 7f32e4ff9700 10 client.5688 nothing to flush
2022-01-13T05:23:31.213+0000 7f32e4ff9700 20 client.5688 put_inode on 0x10000000000.head(faked_ino=0 nref=14 ll_ref=1 cap_refs={} open={3=0} mode=100664 size=0/4194304 nlink=1 btime=2022-01-13T05:23:18.640986+0000 mtime=2022-01-13T05:23:18.645267+0000 ctime=2022-01-13T05:23:18.646363+0000 caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) objectset[0x10000000000 ts 0/0 objects 0 dirty_or_tx 0] parents=0x1.head["test_file"] 0x7f32e0009670) n = 1
2022-01-13T05:23:31.213+0000 7f32e4ff9700 10 client.5688 check_caps on 0x10000000000.head(faked_ino=0 nref=14 ll_ref=1 cap_refs={} open={3=0} mode=100664 size=0/4194304 nlink=1 btime=2022-01-13T05:23:18.640986+0000 mtime=2022-01-13T05:23:18.645267+0000 ctime=2022-01-13T05:23:18.646363+0000 caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) objectset[0x10000000000 ts 0/0 objects 0 dirty_or_tx 0] parents=0x1.head["test_file"] 0x7f32e0009670) wanted - used - issued pAsxLsXsxFsxcrwb revoking - flags=0
2022-01-13T05:23:31.213+0000 7f32e4ff9700 10 client.5688 cap mds.0 issued pAsxLsXsxFsxcrwb implemented pAsxLsXsxFsxcrwb revoking -
2022-01-13T05:23:31.213+0000 7f32e4ff9700 10 client.5688 delaying cap release
2022-01-13T05:23:31.213+0000 7f32e4ff9700 10 client.5688 cap_delay_requeue on 0x10000000000.head(faked_ino=0 nref=14 ll_ref=1 cap_refs={} open={3=0} mode=100664 size=0/4194304 nlink=1 btime=2022-01-13T05:23:18.640986+0000 mtime=2022-01-13T05:23:18.645267+0000 ctime=2022-01-13T05:23:18.646363+0000 caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) objectset[0x10000000000 ts 0/0 objects 0 dirty_or_tx 0] parents=0x1.head["test_file"] 0x7f32e0009670)
2022-01-13T05:23:31.213+0000 7f32e4ff9700 10 client.5688 _release_fh 0x7f32c8002d00 on inode 0x10000000000.head(faked_ino=0 nref=14 ll_ref=1 cap_refs={} open={3=0} mode=100664 size=0/4194304 nlink=1 btime=2022-01-13T05:23:18.640986+0000 mtime=2022-01-13T05:23:18.645267+0000 ctime=2022-01-13T05:23:18.646363+0000 caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) objectset[0x10000000000 ts 0/0 objects 0 dirty_or_tx 0] parents=0x1.head["test_file"] 0x7f32e0009670) no async_err state
2022-01-13T05:23:31.213+0000 7f32e4ff9700 20 client.5688 put_inode on 0x10000000000.head(faked_ino=0 nref=14 ll_ref=1 cap_refs={} open={3=0} mode=100664 size=0/4194304 nlink=1 btime=2022-01-13T05:23:18.640986+0000 mtime=2022-01-13T05:23:18.645267+0000 ctime=2022-01-13T05:23:18.646363+0000 caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) objectset[0x10000000000 ts 0/0 objects 0 dirty_or_tx 0] parents=0x1.head["test_file"] 0x7f32e0009670) n = 1
From: /ceph/teuthology-archive/vshankar-2022-01-13_04:42:40-fs-wip-vshankar-testing-20220106-145819-testing-default-smithi/6612000/remote/smithi052/log/ceph-client.0.51171.log.gz
Don't see a read call... maybe it's being optimized out since the file is known to be empty? If so, not sure how this ever worked. Would suggest experimenting with a file with contents to see if that makes a difference.
Updated by Xiubo Li about 4 years ago
- Status changed from New to Fix Under Review
- Pull request ID set to 44679
Updated by Xiubo Li about 4 years ago
When the 'test_file' is empty the kernel won't trigger go do the real read, so the libcephfs won't get a chance to call read. Seems will call a getattr before calling read do decide whether to call read or not.
Updated by Xiubo Li about 4 years ago
From kernel fs and fuse code:
1567 static ssize_t fuse_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
1568 {
1569 struct file *file = iocb->ki_filp;
1570 struct fuse_file *ff = file->private_data;
1571 struct inode *inode = file_inode(file);
1572
1573 if (fuse_is_bad(inode))
1574 return -EIO;
1575
1576 if (FUSE_IS_DAX(inode))
1577 return fuse_dax_read_iter(iocb, to);
1578
1579 if (!(ff->open_flags & FOPEN_DIRECT_IO))
1580 return fuse_cache_read_iter(iocb, to);
1581 else
1582 return fuse_direct_read_iter(iocb, to);
1583 }
In Line#1580 above when doing the cache read it will update the attribute first, which will fire a 'getattr' request to libcephfs:
989 static ssize_t fuse_cache_read_iter(struct kiocb *iocb, struct iov_iter *to)
990 {
991 struct inode *inode = iocb->ki_filp->f_mapping->host;
992 struct fuse_conn *fc = get_fuse_conn(inode);
993
994 /*
995 * In auto invalidate mode, always update attributes on read.
996 * Otherwise, only update if we attempt to read past EOF (to ensure
997 * i_size is up to date).
998 */
999 if (fc->auto_inval_data ||
1000 (iocb->ki_pos + iov_iter_count(to) > i_size_read(inode))) {
1001 int err;
1002 err = fuse_update_attributes(inode, iocb->ki_filp, STATX_SIZE);
1003 if (err)
1004 return err;
1005 }
1006
1007 return generic_file_read_iter(iocb, to);
1008 }
The 'getattr' will get the inode size related fields, which is zero.
And then when doing the generic_file_read_iter()-->filemap_read():
2628 do {
2629 cond_resched();
2630
2631 /*
2632 * If we've already successfully copied some data, then we
2633 * can no longer safely return -EIOCBQUEUED. Hence mark
2634 * an async read NOWAIT at that point.
2635 */
2636 if ((iocb->ki_flags & IOCB_WAITQ) && already_read)
2637 iocb->ki_flags |= IOCB_NOWAIT;
2638
2639 if (unlikely(iocb->ki_pos >= i_size_read(inode)))
2640 break;
2641
2642 error = filemap_get_pages(iocb, iter, &pvec);
2643 if (error < 0)
2644 break;
When it see that "iocb->ki_pos >= i_size_read(inode)" it will return directly without fire real read to libcephfs.
This is why when the read offset >= file size, we won't see the 'read' in libcephfs.
I have checked the direct IO in kernel & fuse code, also has the same logic.
Updated by Patrick Donnelly about 4 years ago
That explains the current behavior but why did this test not fail in the past?
Updated by Xiubo Li about 4 years ago
Patrick Donnelly wrote:
That explains the current behavior but why did this test not fail in the past?
Is it possible that the 'test_file' was created by other test cases and wasn't removed ? Then the 'test_file' was not empty ?
Updated by Venky Shankar about 4 years ago
Xiubo Li wrote:
Patrick Donnelly wrote:
That explains the current behavior but why did this test not fail in the past?
Is it possible that the 'test_file' was created by other test cases and wasn't removed ? Then the 'test_file' was not empty ?
That's possible. I've run into such things earlier where a previous test does not cleanup files and the subsequent test (depending on what its expecting) fails.
Updated by Xiubo Li about 4 years ago
Venky Shankar wrote:
Xiubo Li wrote:
Patrick Donnelly wrote:
That explains the current behavior but why did this test not fail in the past?
Is it possible that the 'test_file' was created by other test cases and wasn't removed ? Then the 'test_file' was not empty ?
That's possible. I've run into such things earlier where a previous test does not cleanup files and the subsequent test (depending on what its expecting) fails.
Okay, before I thought the previous test files should be cleaned up anyway.
Updated by Patrick Donnelly about 1 year ago
- Assignee changed from Xiubo Li to Patrick Donnelly
- Target version changed from v17.0.0 to v20.0.0
- Backport set to squid,reef
- Pull request ID changed from 44679 to 61675
Updated by Patrick Donnelly about 1 year ago
- Status changed from Fix Under Review to Pending Backport
Updated by Upkeep Bot about 1 year ago
- Copied to Backport #69957: reef: qa: Test failure: test_pool_perm (tasks.cephfs.test_pool_perm.TestPoolPerm) added
Updated by Upkeep Bot about 1 year ago
- Copied to Backport #69958: squid: qa: Test failure: test_pool_perm (tasks.cephfs.test_pool_perm.TestPoolPerm) added
Updated by Upkeep Bot about 1 year ago
- Tags (freeform) set to backport_processed
Updated by Upkeep Bot 8 months ago
- Merge Commit set to 2448960bc66a8065bb914ac135bb0fc4d283812b
- Fixed In set to v19.3.0-7579-g2448960bc66
- Upkeep Timestamp set to 2025-07-09T17:10:59+00:00
Updated by Upkeep Bot 8 months ago
- Fixed In changed from v19.3.0-7579-g2448960bc66 to v19.3.0-7579-g2448960bc6
- Upkeep Timestamp changed from 2025-07-09T17:10:59+00:00 to 2025-07-14T17:42:11+00:00
Updated by Kumar Hemanth 8 months ago
Updated by Milind Changire 6 months ago
Updated by Upkeep Bot 5 months ago
- Released In set to v20.2.0~1124
- Upkeep Timestamp changed from 2025-07-14T17:42:11+00:00 to 2025-11-01T00:58:45+00:00