Skip to content

posix_fallocate can be interrupted by a signal #451

@shastah

Description

@shastah

Every now and then, use of any Pacemaker cmdline tool fails and results in Pacemaker logging errors:

  (pcmk__new_client)        debug: New IPC client 3efdbecf-c2d9-44bc-b4a6-9bcd48021ba1 for PID 27492 with uid 0 and gid 0
  (handle_new_connection)   debug: IPC credentials authenticated (/dev/shm/qb-7271-27492-12-hfPbKY/qb)
  (qb_ipcs_shm_connect)     debug: connecting to client [27492]
  (qb_rb_open_2)    debug: shm size:524301; real_size:528384; rb->word_size:132096
  (qb_rb_open_2)    debug: shm size:524301; real_size:528384; rb->word_size:132096
  (qb_sys_mmap_file_open)   error: couldn't allocate file /dev/shm/qb-7271-27492-12-hfPbKY/qb-event-cib_rw-data: Interrupted system call (4)
  (qb_rb_open_2)    error: couldn't create file for mmap
  (qb_ipcs_shm_rb_open)     error: qb_rb_open:/dev/shm/qb-7271-27492-12-hfPbKY/qb-event-cib_rw: Interrupted system call (4)
  (qb_rb_close_helper)      debug: Free'ing ringbuffer: /dev/shm/qb-7271-27492-12-hfPbKY/qb-response-cib_rw-header
  (qb_rb_close_helper)      debug: Free'ing ringbuffer: /dev/shm/qb-7271-27492-12-hfPbKY/qb-request-cib_rw-header
  (qb_ipcs_shm_connect)     error: shm connection FAILED: Interrupted system call (4)
  (handle_new_connection)   error: Error in connection setup (/dev/shm/qb-7271-27492-12-hfPbKY/qb): Interrupted system call (4)

The way we worked around it is with a silly "retry" logic in qb_sys_mmap_file_open(). Not sure if this is the best solution, but it proved to be "good enough":

--- a/lib/unix.c	2020-07-29 08:37:35.000000000 +0200
+++ b/lib/unix.c	2021-01-28 21:01:43.984432610 +0100
@@ -79,6 +79,9 @@
 	ssize_t written;
 	char *buffer = NULL;
 	char *is_absolute = strchr(file, '/');
+#ifdef HAVE_POSIX_FALLOCATE
+	int32_t fallocate_retry = 5;
+#endif
 
 	if (is_absolute) {
 		(void)strlcpy(path, file, PATH_MAX);
@@ -116,12 +119,22 @@
 	}
 
 #ifdef HAVE_POSIX_FALLOCATE
-	if ((res = posix_fallocate(fd, 0, bytes)) != 0) {
-		errno = res;
-		res = -1 * res;
-		qb_util_perror(LOG_ERR, "couldn't allocate file %s", path);
-		goto unlink_exit;
-	}
+	/* posix_fallocate(3) can be interrupted by a signal,
+	   so retry few times before giving up */
+	do {
+		fallocate_retry--;
+		res = posix_fallocate(fd, 0, bytes);
+		if (res == EINTR) {
+			qb_util_log(LOG_DEBUG, "got EINTR trying to allocate file %s, retrying...", path);
+			continue;
+		} else if (res != 0) {
+			errno = res;
+			res = -1 * res;
+			qb_util_perror(LOG_ERR, "couldn't allocate file %s", path);
+			goto unlink_exit;
+		}
+		break;
+	} while (fallocate_retry > 0);
 #else
 	if (file_flags & O_CREAT) {
 		long page_size = sysconf(_SC_PAGESIZE);

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions