Commit c703cf9

osd/bluestore: Actually wait until completion in write_sync
This function is only used for RocksDB WAL writes, so it must sync data. This fixes ceph#18338 and thus makes it possible to actually set `bluefs_preextend_wal_files` to true, gaining +100% single-thread write IOPS in disk-bound (HDD or bad SSD) setups. To my knowledge it doesn't hurt performance in other cases. Test it yourself on any HDD with `fio -ioengine=rbd -direct=1 -bs=4k -iodepth=1`.

Issue ceph#18338 is easily reproduced without this patch by issuing a `kill -9` to the OSD while running `fio -ioengine=rbd -direct=1 -bs=4M -iodepth=16`.

Fixes: https://tracker.ceph.com/issues/18338 https://tracker.ceph.com/issues/38559
Signed-off-by: Vitaliy Filippov <vitalif@yourcmc.ru>
1 parent f6fd7b1 commit c703cf9

1 file changed

Lines changed: 2 additions & 2 deletions

src/os/bluestore/KernelDevice.cc

@@ -762,8 +762,8 @@ int KernelDevice::_sync_write(uint64_t off, bufferlist &bl, bool buffered, int w
   }
 #ifdef HAVE_SYNC_FILE_RANGE
   if (buffered) {
-    // initiate IO (but do not wait)
-    r = ::sync_file_range(fd_buffereds[WRITE_LIFE_NOT_SET], off, len, SYNC_FILE_RANGE_WRITE);
+    // initiate IO and wait till it completes
+    r = ::sync_file_range(fd_buffereds[WRITE_LIFE_NOT_SET], off, len, SYNC_FILE_RANGE_WRITE|SYNC_FILE_RANGE_WAIT_AFTER|SYNC_FILE_RANGE_WAIT_BEFORE);
     if (r < 0) {
       r = -errno;
       derr << __func__ << " sync_file_range error: " << cpp_strerror(r) << dendl;
