Retry if posix_fallocate is interrupted with EINTR #453
Merged
chrissie-c merged 1 commit into ClusterLabs:main on Jan 14, 2022
Conversation
Every now and then Pacemaker reports errors:

(pcmk__new_client) debug: New IPC client 3efdbecf-c2d9-44bc-b4a6-9bcd48021ba1 for PID 27492 with uid 0 and gid 0
(handle_new_connection) debug: IPC credentials authenticated (/dev/shm/qb-7271-27492-12-hfPbKY/qb)
(qb_ipcs_shm_connect) debug: connecting to client [27492]
(qb_rb_open_2) debug: shm size:524301; real_size:528384; rb->word_size:132096
(qb_rb_open_2) debug: shm size:524301; real_size:528384; rb->word_size:132096
(qb_sys_mmap_file_open) error: couldn't allocate file /dev/shm/qb-7271-27492-12-hfPbKY/qb-event-cib_rw-data: Interrupted system call (4)
(qb_rb_open_2) error: couldn't create file for mmap
(qb_ipcs_shm_rb_open) error: qb_rb_open:/dev/shm/qb-7271-27492-12-hfPbKY/qb-event-cib_rw: Interrupted system call (4)
(qb_rb_close_helper) debug: Free'ing ringbuffer: /dev/shm/qb-7271-27492-12-hfPbKY/qb-response-cib_rw-header
(qb_rb_close_helper) debug: Free'ing ringbuffer: /dev/shm/qb-7271-27492-12-hfPbKY/qb-request-cib_rw-header
(qb_ipcs_shm_connect) error: shm connection FAILED: Interrupted system call (4)
(handle_new_connection) error: Error in connection setup (/dev/shm/qb-7271-27492-12-hfPbKY/qb): Interrupted system call (4)

While this could probably be addressed in Pacemaker code, a simple retry loop for the case where posix_fallocate(3) returns EINTR seems to be a decent workaround.

Fixes: ClusterLabs#451
Signed-off-by: Jakub Jankowski <shasta@toxcorp.com>
Can one of the admins verify this patch?
Member
Quite a few users are encountering the same issue, where requests to the pacemaker-based daemon occasionally fail on IPC. It seems sensible to retry in this situation. One noticeable difference from the ! HAVE_POSIX_FALLOCATE case below: if the interruption keeps occurring, that path keeps retrying write() indefinitely rather than only up to a limited number of times. So the question is whether it makes sense to do the same with posix_fallocate().
Contributor
test this please
chrissie-c (Contributor) approved these changes on Jan 14, 2022 and left a comment:
Thanks, that looks good to me.