[RFC] With_resource by gadmm · Pull Request #8962 · ocaml/ocaml

gadmm · 2019-09-22T19:28:12Z

This is a draft design proposal for a new "unwind-protect" abstraction for resource management called Fun.with_resource, inspired by Haskell's Control.Exception.bracket (see Marlow et al., Asynchronous exceptions in Haskell). It guarantees that the release function gets executed upon exit if and only if acquisition returned normally, and that no asynchronous exception arises during acquisition and release. Thanks to these guarantees, with_resource can be used to build scoped resource-management combinator (cf. with_ idiom) by supplying appropriate acquire and release functions.

More precisely, for Fun.with_resource ~acquire ~release to bring resource-safety guarantees, the programmer needs to fulfill the following conditions:

acquire either succeeds, or if it fails it does so by raising an exception, without having acquired the resource, for instance by undoing changes;
release never fails to release the resource, in particular it never raises.

Disabling asynchronous exceptions is crucial to let the user enforce the conditions. (These guarantees are called respectively strong exception-safety and no-raise guarantee, see Abrahams.)

Relationship with Fun.protect: Fun.with_resource does not replace Fun.protect. The latter does not have the right signature to properly implement the masking during acquisition. This remark highlighting a potential defect of Fun.protect led at the time to an audit of usages of unwind-protect combinators (in the compiler codebase), which led to the conclusion that both are complementary and useful. Fun.protect, as its documentation currently indicates, allows the execution of arbitrary code to enforce local invariants, with examples from the compiler codebase involving exceptions used for control flow.

The first commits have to do with masking and are proposed separately there: #8961. This is to discuss the last 3 commits which live on top of them.

Outside of the technical details of masking to be discussed in #8961, there is the question of how to handle exceptions arising from release (i.e., bugs). This is delicate. The purpose of this combinator is to provide some guarantees (resource safety) in case of any serious and unexpected exceptions (or expected asynchronous interruption) arising during the execution of a program, so that the problem remains isolated and does not bring down the outside world with it. But if an exception arises from release, then the code meant to enforce this guarantee itself has failed, and there is no meaningful way to continue the program. To avoid introducing a new kind of super-exception somehow "worse" than other exceptions (and for which there is no programmatically meaningful way to handle it, even for hardening purposes), the uncaught exceptions in release are handled with an uncaught exception handler which by default outputs a fatal error and exit(2). Learning from previous discussions on the topic, this would be better treated with a customisable handler where the user can decide to log and/or ignore the exception. (This was implemented in a previous version of this PR, but I preferred to simplify it during the rebasing given the lack of traction on the related PR #8963.)

This PR being the user-facing part of #8961, the following remains to be done:

Let the user customise behaviour with exceptions-in-destructor.
Fix the default behaviour of uninterruptible EINTR-loops in io.c/unix.c when the mask is set and provide customisable behaviour.
Implement some with_ functions in the library, e.g. files, mutexes...

gasche · 2019-11-07T14:13:08Z

(This looks like something for @jhjourdan and @stedolan to look at.)

gadmm · 2019-11-07T14:51:48Z

But wait, both this and #8961 are on hold until they are rebased. The code changed a lot with the other PRs that passed.

If an action is pending in C code, young_limit forces to take the expensive path every time, until the pending action is cleared. To avoid this, we check against a new limit young_limit_c instead of young_limit in Alloc_small_aux from now on, used to record the next implicit or requested major slice and minor collection. To make sure memprof samples the blocks, we record caml_memprof_young_trigger in young_limit_c. * I checked with Godbolt that the new branching inside Alloc_small_limit is erased at compilation time. This is true for all compilers proposed by Godbolt (including old ones) except ppci and cc65. In addition, all compilers give identical output as if there was no branching with -O0, except msvc which sometimes required -O1 to remove a noop. * There is a new call to `caml_notify_action` inside `caml_alloc_small_dispatch` via `caml_set_action_pending` that fixes the behaviour where action_pending might be set but the program not notified when returning to OCaml code. This is also used to stop exporting caml_something_to_do later on. Without the rest of this commit, the change would slow down all C allocations up to the next explicit poll.

* Preparations for masking. * `caml_something_to_do` was declared as internal but was exported and used in ocamlcc (among all opam packages). This usage in ocamlcc is superseded by the official interface `caml_process_pending_actions_exn`, and `caml_check_pending_actions`.

This is a flag to control the delaying of asynchronous callbacks (signal handlers, finalisers and memprof callbacks). We distinguish two kinds of asynchronous actions: - unrestricted asynchronous callbacks: possibly-raising signal handlers, memprof callbacks and finalisers (including GC alarms). - non-raising actions, such as non-raising signal handlers (e.g. systhread's yield). (The latter is set as a non-raising handler in an ulterior commit.) Masking temporarily delays the execution of asynchronous callbacks. New functions caml_mask and caml_unmask to control which kind of asynchronous callbacks are delayed.

So, it will no longer be delayed by Sys.mask. For simplicity and to avoid changing the definition of Sys.signal_behavior, "non-raising" signal handlers are not exposed through the Sys module, but are accessible through the new primitive.

This polling was introduced at 51c55b2 to "preserve the semantics of the program w.r.t. exceptions". The implementation was a bit hackish since it needed to poll again right after executing callbacks. This is no longer needed for this purpose now that minor allocations poll in bytecode (since 4.10). Furthermore, a signal arising right before the instruction cannot be distinguished from a signal arising just after. Removing this polling simplifies the code, but also prevents looping when recording a signal from a signal handler (which is very useful to purposefully explore all polling locations of a function, e.g. in testing, as used in subsequent commits).

Remove caml_process_pending_signals_exn, replaced by calls to actions. Then remove signals_are_pending which was used as a cache to make the test inside caml_enter_blocking_section efficient. As a side-effect, this permits to record a signal from a signal handler, very useful to have robust tests. We add such a test as a control for tests in subsequent commits.

It tests a possible future Fun.with_resource similar to Haskell's bracket.

gadmm · 2021-01-21T14:54:48Z

It is now rebased, but is still a draft. Only the last 3 commits belong to this PR, the other ones are from #8961. The latter PR is more urgent.

edwintorok · 2022-12-25T12:44:57Z

stdlib/fun.mli

+
+val with_resource :
+  acquire:('a -> 'r) -> 'a -> scope:('r -> 'b) -> release:('r -> unit) -> 'b
+(** [with_resource ~acquire x work ~release] invokes [acquire x], then


Should be: [with_resource ~acquire x ~scope ~release] to match the type signature.

edwintorok · 2022-12-25T12:45:26Z

stdlib/fun.mli

+val with_resource :
+  acquire:('a -> 'r) -> 'a -> scope:('r -> 'b) -> release:('r -> unit) -> 'b
+(** [with_resource ~acquire x work ~release] invokes [acquire x], then
+    invokes [work] on the resulting value, and then invokes [release]


scope instead of work if you intend to follow the type signature

edwintorok · 2022-12-25T12:45:34Z

stdlib/fun.mli

+  acquire:('a -> 'r) -> 'a -> scope:('r -> 'b) -> release:('r -> unit) -> 'b
+(** [with_resource ~acquire x work ~release] invokes [acquire x], then
+    invokes [work] on the resulting value, and then invokes [release]
+    on the value, whether [work] returns or raises an exception. The


edwintorok · 2022-12-25T12:45:39Z

stdlib/fun.mli

+(** [with_resource ~acquire x work ~release] invokes [acquire x], then
+    invokes [work] on the resulting value, and then invokes [release]
+    on the value, whether [work] returns or raises an exception. The
+    result of [work] is then produced, whether it is a value or an


edwintorok · 2022-12-25T14:46:58Z

Better resource handling in OCaml would be great, however both Fun.protect and Fun.with_resource are difficult to use correctly because closing a file descriptor may raise exceptions (and ignoring them may not be the best choice, e.g. when closing the fd on the success path then raising an exception is what you want, and when closing it on the error path not).

Nevertheless with a few helper functions I think it can be used safely, here is approximately how:

let with_file_in ~acquire path f =
    with_resource ~acquire ~scope:f ~release:close_in_noerr path

(** [release_out_exn ch] is like [close_out] but uses [close_out_noerr]
    if the implicit [flush_out] in [clouse_out] fails.
    This is safe because the OCaml runtime only calls [close] once
    even if [close_out] is called multiple times.

    This function is suitable to be passed to `with_sys_resource ~release_exn`.

    Note that this does not guarantee that the underlying file descriptor is closed
    on some OSes, however see https://www.austingroupbugs.net/view.php?id=529
    which talks about POSIX_CLOSE_RESTART and see Emacs's emulation of it:
    https://www.mail-archive.com/bug-gnulib@gnu.org/msg34889.html
*)
let release_out_exn ch =
    with_resource
        ~acquire:Fun.id
        ~release:close_out_noerr
        ~scope:close_out ch

let with_sys_resource ~acquire ~scope ~release_exn =
    let should_release = ref true in
    let scope ch =
        let r = scope ch in
        (* not strictly needed for close_out, but will be needed by Unix.close *)
        should_release := false;
        release_exn ch;
        r
    in
    let release ch =
        if !should_release then
        try release_exn ch with _ -> ()
    in
    with_resource
      ~acquire
      ~release
      ~scope

let with_file_out ~acquire f =
    with_sys_resource
      ~acquire
      ~release_exn:release_out_exn
      ~scope:f

let with_temp_file ?mode ?perms ?temp_dir write handle_path =
    (* this one creates 2 resources: a file in the FS and an open FD in the program *)
    (* handle_path could do a rename+fsync for example *)
    let acquire (prefix, suffix) = Filename.open_temp_file ?mode ?perms ?temp_dir prefix suffix in
    with_sys_resource
        ~acquire
        ~release_exn:(fun (path, _) -> Sys.remove path)
        ~scope:(fun (path, ch) ->
            with_file_out ~acquire:Fun.id write ch;
            handle_path path
        )

let unix_with_file ~acquire f =
    with_sys_resource
        ~acquire ~scope:f
        ~release_exn:Unix.close

The with_sys_resource deals with how a release function looks like in practice and calls it the appropriate number of times.

However having to use a 'ref' to implement it might indicate that the underlying Fun.with_resource is not flexible enough
(or too low level). Perhaps having a release_on_success_exn and release_on_failure_noerr might be better
instead of a single release on Fun.with_resource?

(or the custom error handler might be another way, either way trying out how the API would be used can both provide an example to user on how to use it correctly, and alleviate the need to complicate the lower level API with error handling, e.g. implementing Fun.with_resource as you did here might be enough, if there is also a Fun.with_sys_resource with the error handling semantics implemented on top of it).
Hope this helps.

gadmm mentioned this pull request Sep 22, 2019

Caml_fatal_user_error #8963

Open

gadmm mentioned this pull request Oct 1, 2019

Guarantee that allocation functions do no trigger callbacks when called from C #8691

Merged

gadmm mentioned this pull request Dec 21, 2019

Memprof new api #8920

Merged

gadmm mentioned this pull request Dec 2, 2020

Masking of asynchronous callbacks #8961

Open

gadmm added 16 commits January 20, 2021 23:00

Factor async notification delivery logic with notify_action

667008d

Primitive to test signals

bbbdf10

[minor] Documentation

2eee8f5

Register systhread's tick as non-raising

b8f719c

So, it will no longer be delayed by Sys.mask. For simplicity and to avoid changing the definition of Sys.signal_behavior, "non-raising" signal handlers are not exposed through the Sys module, but are accessible through the new primitive.

Prevent all asynchronous exceptions during uncaught exception handling

6ec6922

Function Sys.mask to run with asynchronous callbacks delayed

5a5d89f

Import tests from the future

55bd7c5

It tests a possible future Fun.with_resource similar to Haskell's bracket.

Add Sys.nonpreemptible_mask

1e3e4ea

Adapt tests for bytecode

ee80da4

Fix tests for Windows?

ab4bd7f

gasche's comments

0480c9c

gadmm force-pushed the with_resource branch from 4fe1bed to 57ebb52 Compare January 20, 2021 22:07

gadmm added 3 commits January 21, 2021 15:22

Make Printexc.get_raw_backtrace non-raising

277693d

Fun.with_resource

5bec330

Fix with_resource for bytecode by making it less clear

1554354

gadmm force-pushed the with_resource branch from 57ebb52 to 1554354 Compare January 21, 2021 14:23

gadmm mentioned this pull request Jun 18, 2021

Add [@poll error] attribute #10462

Merged

gadmm mentioned this pull request Jul 1, 2022

Exception raising Stdlib.at_exit callback breaks Stdlib.exit #11221

Open

edwintorok reviewed Dec 25, 2022

View reviewed changes

redianthus added the stdlib label Jan 21, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[RFC] With_resource#8962

[RFC] With_resource#8962
gadmm wants to merge 19 commits intoocaml:trunkfrom
gadmm:with_resource

gadmm commented Sep 22, 2019 •

edited

Loading

Uh oh!

gasche commented Nov 7, 2019

Uh oh!

gadmm commented Nov 7, 2019

Uh oh!

gadmm commented Jan 21, 2021

Uh oh!

edwintorok Dec 25, 2022

Uh oh!

edwintorok Dec 25, 2022

Uh oh!

edwintorok Dec 25, 2022

Uh oh!

edwintorok Dec 25, 2022

Uh oh!

edwintorok commented Dec 25, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

gadmm commented Sep 22, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

gasche commented Nov 7, 2019

Uh oh!

gadmm commented Nov 7, 2019

Uh oh!

gadmm commented Jan 21, 2021

Uh oh!

edwintorok Dec 25, 2022

Choose a reason for hiding this comment

Uh oh!

edwintorok Dec 25, 2022

Choose a reason for hiding this comment

Uh oh!

edwintorok Dec 25, 2022

Choose a reason for hiding this comment

Uh oh!

edwintorok Dec 25, 2022

Choose a reason for hiding this comment

Uh oh!

edwintorok commented Dec 25, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

gadmm commented Sep 22, 2019 •

edited

Loading