unix.c: use posix_fallocate()#409
Conversation
|
Can one of the admins verify this patch? |
|
test this please |
lib/unix.c
Outdated
| free(buffer); | ||
| } | ||
|
|
||
| #endif |
There was a problem hiding this comment.
As discovered by @wladmis , a newline was lost unfortunately.
| #endif | |
| #endif | |
imz
left a comment
There was a problem hiding this comment.
The resulting patch is quite elegant and robust; on systems without posix_fallocate, the old code employing write would be at work as a fallback.
A sidenote: We have seen problems on linux because of write not actually allocating disk space. Perhaps, the problem is that linux optimizes write(fd, calloc(1, write_size), write_size) as creating a hole, because it sees that the pointer returned by calloc points to a zeroed page (also as a result of optimizing calloc).
Another optimization idea: further in the code, this fd is mmapped and zeroed with memset before usage. (Something to be checked more accurately!) Now, if we see that write(fd, calloc(1, write_size), write_size) is optimized by linux and works faster than memset (because we got the actual SIGBUS "no space" only when reaching the memset), we could stick to the write(fd, calloc(1, write_size), write_size) method of zeroing instead of memset if zeroing is needed by the logic of the code.
|
retest this please |
|
I'm not sure how much optimisation that code needs (though I am in favour of posix_fallocate() part, thank you) as it's only called at rb_open() time. |
Ok, so we shouldn't bother with optimizing Continuing to think about this: As far as I can see now, even doing
The old existing blocks in the file remain unzeroed, though. If we consider So, the zeroing used to happen when |
The only other uses of this function are from ipc_us.c, and there, too, or |
Using of posix_fallocate() guarantees that, if it succeed, the attempting to write to allocated space range does not fail because of lack of storage space. This prevents SIGBUS when trying to write to mmaped file and no space left. Co-Authored-by: Ivan Zakharyaschev <imz@altlinux.org> Reported-by: Mikhail Kulagin <m.kulagin at postgrespro dot ru>
imz
left a comment
There was a problem hiding this comment.
Yes, it looks more safe for the code after posix_fallocate to not invoke any write (as it was already done in the previous revision of the patch) and to not invoke even ftruncate (new in this revision) in order not to get the allocated blocks deallocated by the OS (by creating or "smartly" detecting holes).
|
retest this please |
|
Merged. Thanks very much! |
Using of posix_fallocate() guarantees that, if it succeed, the
attempting to write to allocated space range does not fail because of
lack of storage space. This prevents SIGBUS when trying to write to
mmaped file and no space left.
Co-Authored-by: Ivan Zakharyaschev imz@altlinux.org
Reported-by: Mikhail Kulagin <m.kulagin at postgrespro dot ru>