Skip to content

Conversation

@gadmm
Copy link
Contributor

@gadmm gadmm commented Jul 22, 2023

Dusting-off old commits, as a preliminary change to fix the error handling during the spawning of domains (#12269)...

This PR introduces _exn variants of the functions that do not immediately raise the exception, allowing for correct resource cleanup.

Another motivating example is given at #7423:

caml_invalid_argument(str) is no return and does not free it's argument. So calling it with a string constructed dynamically will mean it'll never get freed.

This led at the time to the introduction of caml_invalid_argument_value and caml_failwith_value that accept an
OCaml string as argument, as a way to work around the lack of control on resource management. But this is a more general problem, for which the present commit gives a general solution that fits the rest of the resource-safe API.

This is best reviewed commit-per-commit.

cc @kayceesrk

@xavierleroy
Copy link
Contributor

As you documented, these exceptions encoded as values by or-ing them with 2 will crash the GC and, therefore, cannot survive an allocation. I think this is too dangerous to be exposed in the FFI. (We've had enough problems with caml_callback*_exn already.) So, perhaps CAML_INTERNALS only?

Or perhaps you expose functions that return normal exception values (these could be useful in other situations, e.g. for other encodings of value-or-exception value) and let the caller use Make_exception_value if they dare. Here, only Make_exception_value would be CAML_INTERNALS and undocumented.

@gadmm
Copy link
Contributor Author

gadmm commented Jul 23, 2023

Taking a step back, this belongs to a project to fix the runtime wrt. async exceptions but also learn from it to improve the FFI (with C resource-management idioms in mind). Hence why making it non-internal.

As a general goal,

  1. the raising status of all FFI functions should be documented,
  2. for the non-internal ones that can raise, one should provide an _exn variant whenever relevant.

This is not the first patch going in this direction, cf. caml_process_pending_actions_exn and the previous proposal for caml_enter_blocking_section_exn (pending rebasing). Also, it became more urgent with the OCaml-Rust FFI, since one should never call a raising-into-OCaml function there. For Rust the safety of the "value or exn" type is not an issue since its type system is expressive enough.

So something like this at least is needed for users, not just for the runtime.

You seem to be discussing the general direction, so it might be a good time to do it. We can explore better alternatives to the xxx10 bit pattern or look into improving it.

  • My point of view is that the priority is to make things possible for expert users, before thinking about how beginners might misuse the thing.
  • The general rule of always calling Extract_exception immediately is not hard to apply.
  • There is an added value of having a single consistent style and practice throughout the FFI.
    DetailsThe style was consistent throughout the runtime at some point, but currently grepping for _exn shows that the naming scheme was not respected during multicore development. But this can be fixed, given this only concerns internal names.
  • The current practice could be improved in various ways:
    • one could introduce a new return type, e.g. value_or_exn, to denote an irregular value (via a type alias as needed for backwards-compatibility; no help from the compiler possible).
    • it is likely cheap to harden the runtime against those bugs, for better error-reporting.
  • The _exn approach could be phased-out in favour of a better-typed alternative (with various options to explore). However a concrete proposal is needed before taking decisions based on this idea.

I am curious to hear your thoughts for the general problems of exceptional return in the FFI rather than for caml/fail.h taken in isolation. My current thought is that the current approach is good enough and has an added value of being already established. The suggestion for making the current PR return normal values makes sense to me in isolation, but the rationale of not having more _exn functions in the future does not, as it stands.

@gasche
Copy link
Member

gasche commented Jul 23, 2023

one could introduce a new return type, e.g. value_or_exn, to denote an irregular value (via a type alias as needed for backwards-compatibility; no help from the compiler possible).

Yes, I was thinking of this today when reviewing your Domain.spawn PR. In fact we could even call it unsafe_value_or_exn or temporary_value_or_exn to emphasize the GC-invalidity issue.

@xavierleroy
Copy link
Contributor

I'm fine with using "exception-returning" style (return the exception describing the error instead of raising it on the spot) within the runtime system first: this has the potential to fix some nasty issues, and it will give us experience with this new approach, e.g. whether the current value-or-exception encoding is usable despite its limitations, or another representation is needed. But I think it's too early to change the public FFI in a way that hard-wires the current value-or-exception encoding. That's why I would much prefer expose functions that construct and return plain exception values, because they are just as useful as the functions that return value-or-exception encodings for the exceptions in question, but do not expose the encoding.

@gadmm
Copy link
Contributor Author

gadmm commented Oct 5, 2023

I now have both implementations, the one embracing the _exn convention and one building regular values. They amount to the same, and it is not clear that one is better than the other (having to use Extract_exception to build regular exception values vs. having to document and use Make_exception_result for now).

I would like to mention that the concern about encoded exceptions was not raised during any of our previous discussions on the topic. Adding new public-facing _exn functions has already been done recently, following the value-or-exception convention that was already public and documented at the time. The present work to add new _exn variants traces back to OCaml 4.10, and adding more of them for the sake of FFI resource-safety was clearly announced as an ultimate goal in the PRs #9037, #8993, and #8997, as well as in a roadmap requested by Gabriel for my work on the runtime that was forwarded to the core OCaml developers a few years ago. The PRs were extensively commented, and the only feedback for the roadmap I received at the time was from Leo White—no concerns about potential confusions about encoded exceptions were raised. Did I miss something that occurred since?

Four years later, moving forward with this work is certainly not too early. We are no longer looking to experiment, because these non-raising variants are necessary for making the OCaml-Rust interoperability memory-safe. With the backwards-compatible proposition to move to a new alias for intnat, the risk of confusing encoded exceptions with the value type is similar at a technical level to confusing value and intnat. I am eager to hear your opinion on whether this change is enough, or what your proposal would be to move forward with this problem. Thanks for any input you might have, the Rust interface can adapt to any concrete proposal.

@gasche
Copy link
Member

gasche commented Feb 20, 2024

My naive understanding of the situation is the following:

  1. This is about how we represent "either an OCaml value or an exception that should be raised to the caller".
  2. Some parts of the runtime system use a hack to encode such payload as a value, with dedicated macros for this.
  3. But we could also make this a proper, distinct, tagged union value_or_exn type.
  4. Whatever the type we use, we want combinators to build the exceptions instead of raising them.

Xavier objects to wider usage of the ugly encoding in (2). Could we instead explore (3), that is a proper type that is distinct from value? (Or maybe we would immediately run into issues because we really want to easily cross the OCaml/C FFI boundary, which can only exchange values?)

@damiendoligez
Copy link
Member

  1. But we could also make this a proper, distinct, tagged union value_or_exn type.

Xavier objects to wider usage of the ugly encoding in (2). Could we instead explore (3), that is a proper type that is distinct from value? (Or maybe we would immediately run into issues because we really want to easily cross the OCaml/C FFI boundary, which can only exchange values?)

We already have such a type: ('a, exception) result. It is a value, and it can be easily handled by OCaml code without adding new primitives. Now the question is, do we want to use heap-allocated blocks for this encoding, or do we want to stick to a C type (and malloc blocks?) for the FFI?

As far as naming is concerned, the choice of the _exn suffix is unfortunate because in the FFI it means "returns a tagged union instead of raising" while in the stdlib it means exactly the opposite.

@gasche
Copy link
Member

gasche commented Mar 6, 2024

My understanding is that the types being discussed are for C code -- C callers and C callees -- so we do not need to allocate the result on the OCaml heap, we could for example use a C tagged union. We would actually get more type-safety this way, as we would have a more descriptive type than value.

@gasche
Copy link
Member

gasche commented Mar 6, 2024

In #13013 I propose a caml_result type that is distinct from value and represents either a returned value or a raised exception. If we can agree on (some variant of) that design, we could move forward with more resource-safe functions in the runtime.

@gadmm
Copy link
Contributor Author

gadmm commented Mar 15, 2024

@gasche proposes to implement separately the suggested approach in the fifth bullet point of my comment #12407 (comment). I think it goes in a right direction and I hope it will keep momentum.

So I propose here a new version of this PR which introduces functions called caml_exception_* that return the (non-encoded) exception as a value, that are not behind CAML_INTERNALS, which is also one approach proposed by @xavierleroy (with the minor difference that I am not changing Make_exception_result to be hidden behind CAML_INTERNALS). It can be considered independently from @gasche's proposal and it will integrate nicely with it once/if it is accepted.

@gadmm gadmm force-pushed the exnize_fail_h branch from fa2a97a to 691c1d6 Compare May 7, 2024 13:55
Copy link
Member

@gasche gasche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at this and I think it is fine. (I like the commit factorizing shared code between fail_byt and fail_nat!) There will be some rebase work to be done on top of caml_result, unfortunately.

This introduces a new naming convention caml_exception_<foo> for the function constructing an exception value whose constructor is foo, as a variant of caml_raise_<foo>. Before we already had caml_exn_<foo> which is the exception constructor for that expression (without arguments), and so for constant exceptions caml_exn_foo and caml_exception_foo happen to be the same thing. This is very slightly awkward but I think it is overall fine and I don't have a better proposal to make, so I propose to go ahead.


/* Non-raising variants of the above functions. The exception is
returned as a normal value, which can be raised with [caml_raise],
typically after cleaning-up resources. */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would make sense to also suggest Result_exception to return a caml_result instead of raising.

runtime/fail.c Outdated
CAMLno_asan
CAMLexport void caml_raise_constant(value tag)
#define Assert_is_tag(tag) \
(CAMLassert(Is_block(tag) && Tag_val(tag) == Object_tag))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found the naming confusing because tag does not mean the same thing in Assert_is_tag / tag and in Tag_val / Object_tag. I understand that you are preserving the naming existing in the file, but I wonder if we could improve it to avoid the confusion. Other parts of the codebase (for example: lambda/translcore.ml:assert_failed) refer to this "tag" as a "slot", which is still unclear but at least does not conflict with the other meaning of "tag"; and then for extensible variants we call this the "extension constructor" which I find a much clearer name. I would find Assert_is_exn_constructor(constr) (or exn_constr) very clear -- and then constr or exn_constr as the variable name instead of tag below.

gadmm and others added 3 commits June 9, 2024 00:56
Simplification for subsequent commits.
Introduce variants of the functions that do not immediately raise the
exception, allowing for resource cleanup.

This is a preliminary change used to fix the error handling during the
spawning of domains.

Another motivating example is given at ocaml#7423:

> caml_invalid_argument(str) is no return and does not free it's
argument. So calling it with a string constructed dynamically will
mean it'll never get freed.

This led at the time to the introduction of
[caml_invalid_argument_value] and [caml_failwith_value] that accept an
OCaml string as argument, as a way to work around the lack of control
on resource management. But this is a more general problem, for which
the present commit gives a general solution in combination with
caml_result.
@gadmm
Copy link
Contributor Author

gadmm commented Jun 8, 2024

I rebased and applied your suggestions. There was nothing special to be done wrt. the caml_result changes, so let me know if you thought of something specific.

@gasche gasche merged commit 544ce93 into ocaml:trunk Jun 9, 2024
@gadmm gadmm mentioned this pull request Jun 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants