Discard out_channel buffered data on permanent I/O error by xavierleroy · Pull Request #12314 · ocaml/ocaml

xavierleroy · 2023-06-20T13:52:25Z

This PR explores the third approach mentioned in #12300 (comment). Implementing it led me to refactor some I/O code in the runtime system, but the new code is an improvement, if I may say so myself.

This PR is best reviewed commit-by-commit.

The third commit changes the flush operation over out_channel to discard the buffered data if the actual write fails with an EBADF ("bad file descriptor", including "the file descriptor was closed") or EPIPE ("pipe was closed at the other end") error. In both cases, retrying the write will fail again; this is not a transient error condition like ENOSPC ("no space left on device"). So, before raising Sys_error, we just remove the buffered data.

This way, if the channel is finalized later, it will appear as empty (not containing unflushed data) and it will be freed immediately. This avoids the memory leak reported in #12300. It also avoids keeping the channel around and trying to flush it again when the program exits, which can lead to writing to the wrong file, as shown in #12300.
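As a rough illustration of the behavior described above, here is a minimal sketch in C. It is not the runtime's code: `struct demo_channel`, `demo_is_permanent_error` and `demo_flush` are hypothetical names, and returning -1 stands in for raising Sys_error.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical, simplified output channel; not the runtime's struct channel. */
struct demo_channel {
  int fd;
  char buf[4096];
  size_t len;            /* number of buffered bytes not yet written */
};

/* Errors for which retrying the write cannot succeed. */
static int demo_is_permanent_error(int err)
{
  return err == EBADF || err == EPIPE;
}

/* Try to write out the buffered data.
   Returns 0 on success and -1 on error, with errno set by write().
   On a permanent error the buffer is emptied before the failure is reported,
   so the channel no longer appears to hold unflushed data. */
static int demo_flush(struct demo_channel *ch)
{
  assert(ch->len <= sizeof ch->buf);
  while (ch->len > 0) {
    ssize_t n = write(ch->fd, ch->buf, ch->len);
    if (n == -1) {
      if (errno == EINTR) continue;                      /* interrupted: retry */
      if (demo_is_permanent_error(errno)) ch->len = 0;   /* discard the data */
      return -1;          /* the real runtime would raise Sys_error here */
    }
    /* Partial write: keep the unwritten tail at the start of the buffer. */
    memmove(ch->buf, ch->buf + n, ch->len - (size_t)n);
    ch->len -= (size_t)n;
  }
  return 0;
}
```

With a closed or invalid descriptor, `demo_flush` fails with EBADF and leaves `len` at 0, which is the property that lets a finalizer treat the channel as empty.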

To implement this behavior, I had to change the interface of caml_read_fd and caml_write_fd. Currently, they return an error code for EINTR and directly raise a Sys_error exception for any other error. The second commit changes them to return -1 on any error and set errno to the error code, like a POSIX system call. It's now the caller's responsibility to act on the error, either by retrying, or by raising Sys_error, or by emptying an out_channel then raising Sys_error.
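The calling convention described above can be sketched as follows. This is an illustration under assumptions, not the runtime's implementation: `demo_read_fd` and `demo_read_retrying` are hypothetical names, and the place where the real runtime would process signals or raise Sys_error is only marked by comments.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical stand-in for the new-style primitive: like a POSIX system
   call, it returns the number of bytes read, or -1 with errno set.
   It never raises and never retries by itself. */
static ssize_t demo_read_fd(int fd, void *buf, size_t n)
{
  return read(fd, buf, n);
}

/* Caller-side policy: retry on EINTR; on any other error, return -1 and
   leave errno set so that the caller can turn it into an exception. */
static ssize_t demo_read_retrying(int fd, void *buf, size_t n)
{
  assert(buf != NULL);
  for (;;) {
    ssize_t r = demo_read_fd(fd, buf, n);
    if (r != -1 || errno != EINTR) return r;
    /* EINTR: a real caller would handle pending signals here, then retry. */
  }
}
```

Keeping the policy in the caller is what allows a flush operation to do something different from a plain read or write, such as emptying a buffer before reporting the error.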

But there's a snag in the Windows implementation of caml_{read,write}_fd. They either call CRT functions, which leave POSIX-style error codes in errno, or Win32 socket functions, which produce Win32 error codes via WSAGetLastError(). So we need to convert from Win32 error codes to POSIX error codes. There's a _dosmaperr function in the CRT that does just this but is not exported. There's a caml_win32_maperr function in otherlibs/unix that does just this but we need it in the runtime. So, the first commit bites the bullet and moves the core of the caml_win32_maperr function from otherlibs/unix/unixsupport_win32.c to runtime/win32.c, where it can be used by caml_{read,write}_fd and also by caml_win32_rename. Phew!

xavierleroy · 2023-06-20T13:54:39Z

runtime/win32.c

-  case ERROR_CURRENT_DIRECTORY: case ERROR_BUSY:
-    errno = EBUSY; break;


The new code maps ERROR_CURRENT_DIRECTORY to EACCES, not EBUSY, for consistency with _dosmaperr and the Win32 Unix library. It's a small change to map it back to EBUSY, but I'm not sure it's worth departing from _dosmaperr here.

nojb

Did a first pass over the code, could only spot one potential issue; question below.

nojb

LGTM, could not spot any other issues.

nojb · 2023-06-22T08:01:47Z

One question is whether there are other error codes that should be handled in the same way as EBADF and EPIPE, but these are probably the most common ones.

xavierleroy · 2023-06-22T08:10:55Z

I went through man 2 write, and the other error codes look like one of the following:

- transient errors, just retry: EAGAIN, EWOULDBLOCK, EIO
- recoverable conditions, just fix the real issue and retry: EDESTADDRREQ, EDQUOT (barely), EFBIG (barely too), ENOSPC, EPERM
- a bug in the runtime system (EFAULT) or in the user's program (EINVAL).
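The classification above, together with the EBADF/EPIPE case from the PR description, could be written down as a small function. This is only a sketch of the reasoning in this thread: the enum and `demo_classify_write_error` are hypothetical and do not exist in the runtime.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical categories mirroring the discussion in this thread. */
enum demo_error_class {
  DEMO_PERMANENT,    /* retrying cannot succeed: discard buffered data */
  DEMO_TRANSIENT,    /* just retry */
  DEMO_RECOVERABLE,  /* fix the real issue, then retry */
  DEMO_BUG,          /* bug in the runtime system or in the user's program */
  DEMO_OTHER         /* not covered by the discussion */
};

static enum demo_error_class demo_classify_write_error(int err)
{
  assert(err != 0);  /* only meaningful after a failed write */
  /* EAGAIN and EWOULDBLOCK may share a value on some systems, so they are
     tested with ifs rather than with possibly duplicate case labels. */
  if (err == EAGAIN || err == EWOULDBLOCK || err == EIO)
    return DEMO_TRANSIENT;
  switch (err) {
  case EBADF: case EPIPE:
    return DEMO_PERMANENT;
  case EDESTADDRREQ: case EDQUOT: case EFBIG: case ENOSPC: case EPERM:
    return DEMO_RECOVERABLE;
  case EFAULT: case EINVAL:
    return DEMO_BUG;
  default:
    return DEMO_OTHER;
  }
}
```

Only the `DEMO_PERMANENT` class triggers the discard of buffered data in this PR; the other classes keep the data so that a later flush can still succeed.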

avsm · 2023-06-22T08:32:14Z

runtime/win32.c

+  { ERROR_LOCK_FAILED, 0, EACCES},
+  { ERROR_ALREADY_EXISTS, 0, EEXIST},
+  { ERROR_FILENAME_EXCED_RANGE, 0, ENOENT},
+  { ERROR_NESTING_NOT_ALLOWED, 0, EAGAIN},


Returning EAGAIN on ERROR_NESTING_NOT_ALLOWED doesn't seem to fit the "resource temporarily unavailable" meaning. Wouldn't this potentially cause an infinite loop, since retrying won't fix the fact that there are nested LoadModule invocations earlier in the call chain? But Python's win32_to_errno function does the same mapping to EAGAIN, so I'm probably wrong (but not sure why).

(I read through these mappings while reading the rest of the diff; possibly better as a followup PR if a change is needed in these mappings)

xavierleroy

The _dosmaperr function from the CRT maps ERROR_NESTING_NOT_ALLOWED to EAGAIN. As with ERROR_CURRENT_DIRECTORY, we could choose a different mapping, but do we really want to diverge from the CRT?
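For readers unfamiliar with this kind of table, here is a minimal, portable sketch of a table-driven conversion using the four entries quoted in this thread. Several things are assumptions: the `DEMO_` constants are stand-ins so the code compiles without `<windows.h>` (their values are believed to match winerror.h but should be checked), the middle field is interpreted as a count of additional consecutive codes covered by the entry, and returning 0 for an unknown code is an arbitrary choice. `demo_posix_of_win32` is a hypothetical name, not the runtime's function.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-in constants; values believed to match winerror.h (assumption). */
#define DEMO_ERROR_LOCK_FAILED          167UL
#define DEMO_ERROR_ALREADY_EXISTS       183UL
#define DEMO_ERROR_FILENAME_EXCED_RANGE 206UL
#define DEMO_ERROR_NESTING_NOT_ALLOWED  215UL

struct demo_error_entry {
  unsigned long win_code;  /* first Win32 code covered by this entry */
  int range;               /* assumed: extra consecutive codes covered */
  int posix_code;          /* POSIX errno value to report */
};

static const struct demo_error_entry demo_table[] = {
  { DEMO_ERROR_LOCK_FAILED, 0, EACCES },
  { DEMO_ERROR_ALREADY_EXISTS, 0, EEXIST },
  { DEMO_ERROR_FILENAME_EXCED_RANGE, 0, ENOENT },
  { DEMO_ERROR_NESTING_NOT_ALLOWED, 0, EAGAIN },
};

/* Returns the POSIX code for a Win32 code, or 0 if the table has no entry. */
static int demo_posix_of_win32(unsigned long code)
{
  size_t n = sizeof demo_table / sizeof demo_table[0];
  for (size_t i = 0; i < n; i++) {
    const struct demo_error_entry *e = &demo_table[i];
    assert(e->range >= 0);
    if (code >= e->win_code && code - e->win_code <= (unsigned long)e->range)
      return e->posix_code;
  }
  return 0;
}
```

Changing a single mapping, such as the one for ERROR_NESTING_NOT_ALLOWED, amounts to editing one row, which is why the question here is about consistency with the CRT rather than about implementation effort.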

xavierleroy · 2023-06-28T15:37:59Z

@dra27 do you have an opinion on the Windows part of this PR, esp. the mapping from Win32 error codes to POSIX error codes?

xavierleroy · 2023-07-12T14:31:16Z

Re-based, re-tested, and merged. The Win32 part, and especially the mapping of Win32 error codes, can still be discussed; I'm all ears.

jmid · 2024-01-19T12:35:36Z

I believe this PR introduced a regression, see #12898 for details.

nojb reviewed Jun 22, 2023

nojb approved these changes Jun 22, 2023

avsm reviewed Jun 22, 2023

xavierleroy added a commit to xavierleroy/ocaml that referenced this pull request Jun 28, 2023

Changes entry for ocaml#12314

f2f8a2c

Fixes: ocaml#12300

xavierleroy force-pushed the io-error-handling branch from 79cb8b4 to f2f8a2c Compare June 28, 2023 16:21

Octachron added the windows label Jul 12, 2023

damiendoligez added merge-me windows and removed windows labels Jul 12, 2023

xavierleroy added 4 commits July 12, 2023 15:28

Move Win32 -> POSIX error code conversion to runtime/win32.c

d688fc3

Use it in `caml_win32_rename`.

caml_read_fd and caml_write_fd no longer raise exceptions

535e905

In case of error, they just return -1 after setting errno to an appropriate POSIX error code. The caller is responsible for processing the error. Typically, EINTR handles the signal and retries, other errors raise Sys_error.

Discard buffered data if flush runs into a permanent, non-recoverable I/O error

9f2649c

So that the out_channel can be reclaimed by finalization.

Changes entry for ocaml#12314

ed4d191

Fixes: ocaml#12300

xavierleroy force-pushed the io-error-handling branch from 9bfa1b5 to ed4d191 Compare July 12, 2023 13:30

xavierleroy merged commit 007c040 into ocaml:trunk Jul 12, 2023

xavierleroy deleted the io-error-handling branch July 12, 2023 14:31

NickBarnes pushed a commit to NickBarnes/ocaml that referenced this pull request Jul 14, 2023

Changes entry for ocaml#12314

3bf0a8a

Fixes: ocaml#12300

jmid mentioned this pull request Jan 15, 2024

Regression on Out_channel exceptions #12898

Closed

gasche mentioned this pull request Jan 29, 2024

Consistent errors on closed channels #12947

Closed

jmid mentioned this pull request Feb 6, 2024

Free channel buffers on close rather than leaving them to the GC. #12678

Merged
