Skip to content

Thread-local storage, take 3#13355

Open
gasche wants to merge 18 commits intoocaml:trunkfrom
gasche:thread-local-storage
Open

Thread-local storage, take 3#13355
gasche wants to merge 18 commits intoocaml:trunkfrom
gasche:thread-local-storage

Conversation

@gasche
Copy link
Copy Markdown
Member

@gasche gasche commented Aug 2, 2024

This PR is the third PR to propose thread-local storage, after

#12719 introduced TLS as a separate module from DLS, with its own implementation. I asked whether we should have TLS separate from DLS, there was a debate, with a consensus moving in the direction of replacing DLS with TLS wholesale.

#12724 replaced DLS with TLS, with mostly the existing implementation unchanged, but extra logic to save/restore TLS state on thread context switch. There was a lot of API discussion and wide-reaching name changes.

In the meantime the DLS implementation has changed quite a bit, due to the move to make it thread-safe (but not thread-local) in #12889. We also got more data on third-party usage of DLS ( see for example my inspection of Domain.at_exit use-cases in #12724 (comment) ).

I have decided to revisit the decision of replacing DLS by TLS. Previously I argued that having just one mechanism (1) is conceptually simpler for users, and (2) means less implementation works for OCaml maintainers; so we should optimize for the majority use-case (threads) and sacrifice the other (domains). In the present PR I try an alternative to alleviate both concerns, which is to expose both mechanisms but with the same interface and the same implementation. I believe that this can provide conceptual consistency, at a manageable cost in implementation complexity.

The present PR is in a functional state, its history is clean, it is ready for review and could be integrated quickly if we like the idea.

What is in this PR

  1. Runtime support for TLS storage, alongside the existing support for DLS storage (most of the code is reused/adapted from the previous PRs).

  2. A new internal stdlib module, camlinternalStorage, that exports both TLS and DLS modules with, basically, the same interface that Domain.DLS has today. Internally this is defined using a functor over the state-access primitives (which differ for DLS and TLS), which is instantiated twice. I massaged the code to ensure that the functor application does not hurt the performance of DLS.get, even on non-flambda compilers.

  3. A new user-exposed stdlib module Thread_local_storage that exports TLS, reused from the previous PRs. DLS remains available through Domain.DLS.

  4. Some tweaks to the "storage" interface that come from review discussions in the previous PRs, in particular the type 'a t rather than 'a key and the use of make instead of new_key. The older names remain available in DLS for compatibility, but are not exposed in TLS.

  5. Simple unit tests for TLS, with and without the threads runtime linked. (But note that TLS is not used in the standard library for now, so we get substantially less test coverage that we had in make dls thread local #12724.)

What is not in this PR

  1. DLS.get relies on a compiler primitive supported by the compiler backend. I have not implemented this for TLS.get, which is just a call to a C function. This is good enough for a first iteration of the PR; eventually we want a primitive for performance, but it makes the change more invasive and can come in a later PR.

  2. I believe that a satisfying solution for TLS should expose a form of Thread.at_exit function, so that we can release thread-local resources. No such function is currently implemented, so the picture is incomplete. Implementing Thread.at_exit in a way that works well with and without the threads library linked is not easy, and better left for another PR. I think that my separate PR move the systhreads C code from otherlibs/systhreads to runtime/ #13348 would help moving in this direction (sketch: define CamlinternalThread that implements at_exit differently whether or not the threads machinery has been initialized, and expose at_exit from Thread_local_storage directly.)

  3. make dls thread local #12724 (necessarily) includes a move from DLS to TLS in the OCaml standard library, in particular for Format buffers and the global Random state. No such change is included in the present PR, in part to keep it smaller and in part due to the lack of Thread.at_exit (see point (2) above).

  4. I remain unsatisfied with the new API. I think that "foo-local storage key" should maybe be called a "foo-local variable" or a "foo-local reference", some conceptual change is missing to make the API more consistent with OCaml users -- but maybe we want to expose or at least document both views to remain familiar for domain experts.

    I propose to leave this outside the scope of the current PR, and instead open a separate discussion Reinventing the DLS API #13356 to hash out API proposals. Otherwise this has a good chance to monopolize discussions and prevent discussing the broader picture (one local storage module, two?).

    If we converge on a satisfying API proposal quickly, I can update the PR to reflect it (I already tried to incorporate the consensus from previous PRs.) Otherwise an interface change can perfectly be done in a later PR. The functorized design makes it easy to evolve DLS and TLS in a consistent way.

What to look at, review and discuss

I believe that the main question is whether we like the two-modules solution, and whether we want to go for it in time for OCaml 5.3 (feature freeze date rumored to be mid-August, so that would require swift consensus-building).

If you want to look at the two user-exposed APIs, they are:

(but: I'll open a separate issue to discuss the details of the API design.)

If you want to look at implementation:

If you want to review the PR in full, I would suggest reviewing commit-by-commit.

@gasche gasche mentioned this pull request Aug 2, 2024
@gasche
Copy link
Copy Markdown
Member Author

gasche commented Aug 2, 2024

I opened a discussion thread at #13356 for the details of the local storage API. Please discuss these there, I can easily reflect whatever changes we agree on in the current PR.

@gasche gasche force-pushed the thread-local-storage branch 2 times, most recently from 8af5438 to 3519056 Compare August 3, 2024 15:27
gasche and others added 9 commits August 3, 2024 17:33
The intent is to have this field saved/restored when thread switching,
so that it really acts as thread-local state as opposed to / in
addition to the current domain-local state.
This very small refactor clarifies the responsibilities of sub-parts
of the DLS implementation. In particular, the handling of
parent-initialized keys can be cleanly separated, moved to the end of
the module.
Suggested-by: Daniel Bünzli <daniel.buenzli@erratique.ch>
@gasche gasche force-pushed the thread-local-storage branch from 3519056 to 8c28595 Compare August 3, 2024 15:34
@gasche
Copy link
Copy Markdown
Member Author

gasche commented Aug 3, 2024

Today I pushed two commits that were arguably missing from the PR (there was a "things missing from this PR" section in the PR description, which is now removed):

  • basic unit tests for TLS, including tests where the thread runtime is not linked (then TLS behaves just like DLS)
  • a little optimization work to ensure that the functor application does not reduce the performance of DLS.get (the only performance-critical operation in my book) compared to the previous non-functorized definition

As far as I am concerned, the PR could be merged in its current state.

@gasche gasche force-pushed the thread-local-storage branch from f64b879 to cea07bc Compare August 3, 2024 15:55
@gasche
Copy link
Copy Markdown
Member Author

gasche commented Oct 28, 2024

We discussed this PR at the maintainer meeting today. There isn't a strong consensus in favor of offering both TLS and DLS in the long run, as is being proposed here (people remain doubtful that DLS has enough real-world use to justify central stdlib support), but there was a consensus on the idea of adding TLS and leaving DLS alone for now, especially if we can share code between them as proposed in this PR.

@ejgallego
Copy link
Copy Markdown

Hi @gasche, this is amazing. Looking at it has been on my TODO list for a while.

I'd love to help move it forward if there's any way I can help.

From a user's perspective, it would be great to have a lower barrier to entry than working directly with this OCaml compiler branch.

For example, say I start experimenting with c-cube/thread-local-storage to adapt my code to the TLS API.

Would it then be possible to switch to your branch more or less transparently once my code is running well?

@gasche
Copy link
Copy Markdown
Member Author

gasche commented Jun 18, 2025

The blocking factor (to me) appears to be to find a maintainer willing to review this change. The maintainers that are typically interested in multicore aspects of the runtime are @kayceesrk and @OlivierNicole. I suppose that @stedolan, @damiendoligez and @NickBarnes would also be fine reviewers but they are typically unavailable.

@nojb
Copy link
Copy Markdown
Contributor

nojb commented Jun 18, 2025

3. A new user-exposed stdlib module Thread_local_storage that exports TLS, reused from the previous PRs.

Why not Thread.TLS ? I find the asymmetry with Domain.DLS rather jarring.

@gasche
Copy link
Copy Markdown
Member Author

gasche commented Jun 18, 2025

Using a submodule of Thread forces to link the thread runtime. This may not be what library-authors want if they want to write code that is thread-safe, but without deciding for their users whether they will use threads or not. (This being said, one may argue that DLS should be used in this case, with TLS reserved for cases where we know for sure that threads are involved.)

ejgallego added a commit to ejgallego/coq that referenced this pull request Jun 18, 2025
We use the
[thread-local-storage](https://github.com/c-cube/thread-local-storage)
to store Rocq's imperative state on a per-thread basis.

This enables multi-threading / multi-domain programming, and opens up
many possibilities.

Notes:

- this is at RFC / WIP stage and has been done quickly, as to start
  testing and gather feedback

- very careful review and testing is still needed, in particular to
  ensure `CRef` is not in scope where temporary `refs` have been created

- `mutable` fields haven't been handled yet

- parts that may not be multi-thread capable, such at coqtop or stm
  have not been addressed on their use of refs

- moreover there could be places that rely on actually threads sharing
  a memory block :crossing_fingers:

- There are out-of-summary refs that may interact poorly with this,
  c.f. rocq-prover#20328

- This PR actually touches many parts that could see cleanup and would
  make code quality better; we could even start to introduce the
  `CRef` discipline without activating thread local storage

- Note Xavier comment about TLS on:
  ocaml/ocaml#11770 (comment)

  > Also, I'd like us to keep in mind that thread-local state, like
  > domain-local state, is a quick fix; avoiding global state entirely
  > is a better way to go about parallelism and concurrency.

- Also note Gabriel Scherer work to upstream TLS into Ocaml:
  ocaml/ocaml#13355

Thanks to Guillaume Munch-Maccagnoni for help and advice, and to
c-cube, Gabriel, and other folks making this happen in OCaml.
ejgallego added a commit to ejgallego/coq that referenced this pull request Jun 18, 2025
We use the
[thread-local-storage](https://github.com/c-cube/thread-local-storage)
to store Rocq's imperative state on a per-thread basis.

This enables multi-threading / multi-domain programming, and opens up
many possibilities.

Notes:

- this is at RFC / WIP stage and has been done quickly, as to start
  testing and gather feedback

- very careful review and testing is still needed, in particular to
  ensure `CRef` is not in scope where temporary `refs` have been created

- `mutable` fields haven't been handled yet

- parts that may not be multi-thread capable, such at coqtop or stm
  have not been addressed on their use of refs

- moreover there could be places that rely on actually threads sharing
  a memory block :crossing_fingers:

- There are out-of-summary refs that may interact poorly with this,
  c.f. rocq-prover#20328

- This PR actually touches many parts that could see cleanup and would
  make code quality better; we could even start to introduce the
  `CRef` discipline without activating thread local storage

- Note Xavier comment about TLS on:
  ocaml/ocaml#11770 (comment)

  > Also, I'd like us to keep in mind that thread-local state, like
  > domain-local state, is a quick fix; avoiding global state entirely
  > is a better way to go about parallelism and concurrency.

- Also note Gabriel Scherer work to upstream TLS into Ocaml:
  ocaml/ocaml#13355

Thanks to Guillaume Munch-Maccagnoni for help and advice, and to
c-cube, Gabriel, and other folks making this happen in OCaml.
ejgallego added a commit to ejgallego/coq that referenced this pull request Jun 18, 2025
We use the
[thread-local-storage](https://github.com/c-cube/thread-local-storage)
to store Rocq's imperative state on a per-thread basis.

This enables multi-threading / multi-domain programming, and opens up
many possibilities.

Notes:

- this is at RFC / WIP stage and has been done quickly, as to start
  testing and gather feedback

- very careful review and testing is still needed, in particular to
  ensure `CRef` is not in scope where temporary `refs` have been created

- `mutable` fields haven't been handled yet

- parts that may not be multi-thread capable, such at coqtop or stm
  have not been addressed on their use of refs

- moreover there could be places that rely on actually threads sharing
  a memory block :crossing_fingers:

- There are out-of-summary refs that may interact poorly with this,
  c.f. rocq-prover#20328

- This PR actually touches many parts that could see cleanup and would
  make code quality better; we could even start to introduce the
  `CRef` discipline without activating thread local storage

- Note Xavier comment about TLS on:
  ocaml/ocaml#11770 (comment)

  > Also, I'd like us to keep in mind that thread-local state, like
  > domain-local state, is a quick fix; avoiding global state entirely
  > is a better way to go about parallelism and concurrency.

- Also note Gabriel Scherer work to upstream TLS into Ocaml:
  ocaml/ocaml#13355

Thanks to Guillaume Munch-Maccagnoni for help and advice, and to
c-cube, Gabriel, and other folks making this happen in OCaml.
ejgallego added a commit to ejgallego/coq that referenced this pull request Jun 18, 2025
We use the
[thread-local-storage](https://github.com/c-cube/thread-local-storage)
to store Rocq's imperative state on a per-thread basis.

This enables multi-threading / multi-domain programming, and opens up
many possibilities.

Notes:

- this is at RFC / WIP stage and has been done quickly, as to start
  testing and gather feedback

- very careful review and testing is still needed, in particular to
  ensure `CRef` is not in scope where temporary `refs` have been created

- `mutable` fields haven't been handled yet

- parts that may not be multi-thread capable, such at coqtop or stm
  have not been addressed on their use of refs

- moreover there could be places that rely on actually threads sharing
  a memory block :crossing_fingers:

- There are out-of-summary refs that may interact poorly with this,
  c.f. rocq-prover#20328

- This PR actually touches many parts that could see cleanup and would
  make code quality better; we could even start to introduce the
  `CRef` discipline without activating thread local storage

- Note Xavier comment about TLS on:
  ocaml/ocaml#11770 (comment)

  > Also, I'd like us to keep in mind that thread-local state, like
  > domain-local state, is a quick fix; avoiding global state entirely
  > is a better way to go about parallelism and concurrency.

- Also note Gabriel Scherer work to upstream TLS into Ocaml:
  ocaml/ocaml#13355

Thanks to Guillaume Munch-Maccagnoni for help and advice, and to
c-cube, Gabriel, and other folks making this happen in OCaml.
ejgallego added a commit to ejgallego/coq that referenced this pull request Jun 18, 2025
We use the
[thread-local-storage](https://github.com/c-cube/thread-local-storage)
to store Rocq's imperative state on a per-thread basis.

This enables multi-threading / multi-domain programming, and opens up
many possibilities.

Notes:

- this is at RFC / WIP stage and has been done quickly, as to start
  testing and gather feedback

- very careful review and testing is still needed, in particular to
  ensure `CRef` is not in scope where temporary `refs` have been created

- `mutable` fields haven't been handled yet

- parts that may not be multi-thread capable, such at coqtop or stm
  have not been addressed on their use of refs

- moreover there could be places that rely on actually threads sharing
  a memory block :crossing_fingers:

- There are out-of-summary refs that may interact poorly with this,
  c.f. rocq-prover#20328

- This PR actually touches many parts that could see cleanup and would
  make code quality better; we could even start to introduce the
  `CRef` discipline without activating thread local storage

- Note Xavier comment about TLS on:
  ocaml/ocaml#11770 (comment)

  > Also, I'd like us to keep in mind that thread-local state, like
  > domain-local state, is a quick fix; avoiding global state entirely
  > is a better way to go about parallelism and concurrency.

- Also note Gabriel Scherer work to upstream TLS into Ocaml:
  ocaml/ocaml#13355

Thanks to Guillaume Munch-Maccagnoni for help and advice, and to
c-cube, Gabriel, and other folks making this happen in OCaml.
ejgallego added a commit to ejgallego/coq that referenced this pull request Jun 18, 2025
We use the
[thread-local-storage](https://github.com/c-cube/thread-local-storage)
to store Rocq's imperative state on a per-thread basis.

This enables multi-threading / multi-domain programming, and opens up
many possibilities.

Notes:

- this is at RFC / WIP stage and has been done quickly, as to start
  testing and gather feedback

- very careful review and testing is still needed, in particular to
  ensure `CRef` is not in scope where temporary `refs` have been created

- `mutable` fields haven't been handled yet

- parts that may not be multi-thread capable, such at coqtop or stm
  have not been addressed on their use of refs

- moreover there could be places that rely on actually threads sharing
  a memory block :crossing_fingers:

- There are out-of-summary refs that may interact poorly with this,
  c.f. rocq-prover#20328

- This PR actually touches many parts that could see cleanup and would
  make code quality better; we could even start to introduce the
  `CRef` discipline without activating thread local storage

- Note Xavier comment about TLS on:
  ocaml/ocaml#11770 (comment)

  > Also, I'd like us to keep in mind that thread-local state, like
  > domain-local state, is a quick fix; avoiding global state entirely
  > is a better way to go about parallelism and concurrency.

- Also note Gabriel Scherer work to upstream TLS into Ocaml:
  ocaml/ocaml#13355

Thanks to Guillaume Munch-Maccagnoni for help and advice, and to
c-cube, Gabriel, and other folks making this happen in OCaml.
ejgallego added a commit to ejgallego/coq that referenced this pull request Jun 18, 2025
We use the
[thread-local-storage](https://github.com/c-cube/thread-local-storage)
to store Rocq's imperative state on a per-thread basis.

This enables multi-threading / multi-domain programming, and opens up
many possibilities.

Notes:

- this is at RFC / WIP stage and has been done quickly, as to start
  testing and gather feedback

- very careful review and testing is still needed, in particular to
  ensure `CRef` is not in scope where temporary `refs` have been created

- `mutable` fields haven't been handled yet

- parts that may not be multi-thread capable, such at coqtop or stm
  have not been addressed on their use of refs

- moreover there could be places that rely on actually threads sharing
  a memory block :crossing_fingers:

- There are out-of-summary refs that may interact poorly with this,
  c.f. rocq-prover#20328

- This PR actually touches many parts that could see cleanup and would
  make code quality better; we could even start to introduce the
  `CRef` discipline without activating thread local storage

- Note Xavier comment about TLS on:
  ocaml/ocaml#11770 (comment)

  > Also, I'd like us to keep in mind that thread-local state, like
  > domain-local state, is a quick fix; avoiding global state entirely
  > is a better way to go about parallelism and concurrency.

- Also note Gabriel Scherer work to upstream TLS into Ocaml:
  ocaml/ocaml#13355

Thanks to Guillaume Munch-Maccagnoni for help and advice, and to
c-cube, Gabriel, and other folks making this happen in OCaml.
ejgallego added a commit to ejgallego/coq that referenced this pull request Jun 18, 2025
We use the
[thread-local-storage](https://github.com/c-cube/thread-local-storage)
to store Rocq's imperative state on a per-thread basis.

This enables multi-threading / multi-domain programming, and opens up
many possibilities.

Notes:

- this is at RFC / WIP stage and has been done quickly, as to start
  testing and gather feedback

- very careful review and testing is still needed, in particular to
  ensure `CRef` is not in scope where temporary `refs` have been created

- `mutable` fields haven't been handled yet

- parts that may not be multi-thread capable, such at coqtop or stm
  have not been addressed on their use of refs

- moreover there could be places that rely on actually threads sharing
  a memory block :crossing_fingers:

- There are out-of-summary refs that may interact poorly with this,
  c.f. rocq-prover#20328

- This PR actually touches many parts that could see cleanup and would
  make code quality better; we could even start to introduce the
  `CRef` discipline without activating thread local storage

- Note Xavier comment about TLS on:
  ocaml/ocaml#11770 (comment)

  > Also, I'd like us to keep in mind that thread-local state, like
  > domain-local state, is a quick fix; avoiding global state entirely
  > is a better way to go about parallelism and concurrency.

- Also note Gabriel Scherer work to upstream TLS into Ocaml:
  ocaml/ocaml#13355

Thanks to Guillaume Munch-Maccagnoni for help and advice, and to
c-cube, Gabriel, and other folks making this happen in OCaml.
ejgallego added a commit to ejgallego/coq that referenced this pull request Jun 18, 2025
We use the
[thread-local-storage](https://github.com/c-cube/thread-local-storage)
to store Rocq's imperative state on a per-thread basis.

This enables multi-threading / multi-domain programming, and opens up
many possibilities.

Notes:

- this is at RFC / WIP stage and has been done quickly, as to start
  testing and gather feedback

- very careful review and testing is still needed, in particular to
  ensure `CRef` is not in scope where temporary `refs` have been created

- `mutable` fields haven't been handled yet

- parts that may not be multi-thread capable, such at coqtop or stm
  have not been addressed on their use of refs

- moreover there could be places that rely on actually threads sharing
  a memory block :crossing_fingers:

- There are out-of-summary refs that may interact poorly with this,
  c.f. rocq-prover#20328

- This PR actually touches many parts that could see cleanup and would
  make code quality better; we could even start to introduce the
  `CRef` discipline without activating thread local storage

- Note Xavier comment about TLS on:
  ocaml/ocaml#11770 (comment)

  > Also, I'd like us to keep in mind that thread-local state, like
  > domain-local state, is a quick fix; avoiding global state entirely
  > is a better way to go about parallelism and concurrency.

- Also note Gabriel Scherer work to upstream TLS into Ocaml:
  ocaml/ocaml#13355

Thanks to Guillaume Munch-Maccagnoni for help and advice, and to
c-cube, Gabriel, and other folks making this happen in OCaml.
ejgallego added a commit to ejgallego/coq that referenced this pull request Jun 19, 2025
We use the
[thread-local-storage](https://github.com/c-cube/thread-local-storage)
to store Rocq's imperative state on a per-thread basis.

This enables multi-threading / multi-domain programming, and opens up
many possibilities.

Notes:

- this is at RFC / WIP stage and has been done quickly, as to start
  testing and gather feedback

- very careful review and testing is still needed, in particular to
  ensure `CRef` is not in scope where temporary `refs` have been created

- `mutable` fields haven't been handled yet

- parts that may not be multi-thread capable, such at coqtop or stm
  have not been addressed on their use of refs

- moreover there could be places that rely on actually threads sharing
  a memory block :crossing_fingers:

- There are out-of-summary refs that may interact poorly with this,
  c.f. rocq-prover#20328

- This PR actually touches many parts that could see cleanup and would
  make code quality better; we could even start to introduce the
  `CRef` discipline without activating thread local storage

- Note Xavier comment about TLS on:
  ocaml/ocaml#11770 (comment)

  > Also, I'd like us to keep in mind that thread-local state, like
  > domain-local state, is a quick fix; avoiding global state entirely
  > is a better way to go about parallelism and concurrency.

- Also note Gabriel Scherer work to upstream TLS into Ocaml:
  ocaml/ocaml#13355

Thanks to Guillaume Munch-Maccagnoni for help and advice, and to
c-cube, Gabriel, and other folks making this happen in OCaml.
ejgallego added a commit to ejgallego/coq that referenced this pull request Jun 19, 2025
We use the
[thread-local-storage](https://github.com/c-cube/thread-local-storage)
to store Rocq's imperative state on a per-thread basis.

This enables multi-threading / multi-domain programming, and opens up
many possibilities.

Notes:

- this is at RFC / WIP stage and has been done quickly, as to start
  testing and gather feedback

- very careful review and testing is still needed, in particular to
  ensure `CRef` is not in scope where temporary `refs` have been created

- `mutable` fields haven't been handled yet

- parts that may not be multi-thread capable, such at coqtop or stm
  have not been addressed on their use of refs

- moreover there could be places that rely on actually threads sharing
  a memory block :crossing_fingers:

- native compiler support has to be reverted back, needs more
  investigation, vm_compute also needs work

- There are out-of-summary refs that may interact poorly with this,
  c.f. rocq-prover#20328

- This PR actually touches many parts that could see cleanup and would
  make code quality better; we could even start to introduce the
  `CRef` discipline without activating thread local storage

- Note Xavier comment about TLS on:
  ocaml/ocaml#11770 (comment)

  > Also, I'd like us to keep in mind that thread-local state, like
  > domain-local state, is a quick fix; avoiding global state entirely
  > is a better way to go about parallelism and concurrency.

- Also note Gabriel Scherer work to upstream TLS into Ocaml:
  ocaml/ocaml#13355

Thanks to Guillaume Munch-Maccagnoni for help and advice, and to
c-cube, Gabriel, and other folks making this happen in OCaml.
ejgallego added a commit to ejgallego/coq that referenced this pull request Jun 19, 2025
We use the
[thread-local-storage](https://github.com/c-cube/thread-local-storage)
to store Rocq's imperative state on a per-thread basis.

This enables multi-threading / multi-domain programming, and opens up
many possibilities.

Notes:

- this is at RFC / WIP stage and has been done quickly, as to start
  testing and gather feedback

- very careful review and testing is still needed, in particular to
  ensure `CRef` is not in scope where temporary `refs` have been created

- `mutable` fields haven't been handled yet

- parts that may not be multi-thread capable, such at coqtop or stm
  have not been addressed on their use of refs

- moreover there could be places that rely on actually threads sharing
  a memory block :crossing_fingers:

- native compiler support has to be reverted back, needs more
  investigation, vm_compute also needs work

- There are out-of-summary refs that may interact poorly with this,
  c.f. rocq-prover#20328

- This PR actually touches many parts that could see cleanup and would
  make code quality better; we could even start to introduce the
  `CRef` discipline without activating thread local storage

- Note Xavier comment about TLS on:
  ocaml/ocaml#11770 (comment)

  > Also, I'd like us to keep in mind that thread-local state, like
  > domain-local state, is a quick fix; avoiding global state entirely
  > is a better way to go about parallelism and concurrency.

- Also note Gabriel Scherer work to upstream TLS into Ocaml:
  ocaml/ocaml#13355

Thanks to Guillaume Munch-Maccagnoni for help and advice, and to
c-cube, Gabriel, and other folks making this happen in OCaml.
ejgallego added a commit to ejgallego/coq that referenced this pull request Oct 1, 2025
We use the
[thread-local-storage](https://github.com/c-cube/thread-local-storage)
to store Rocq's imperative state on a per-thread basis.

This enables multi-threading / multi-domain programming, and opens up
many possibilities.

Notes:

- this is at RFC / WIP stage and has been done quickly, as to start
  testing and gather feedback

- very careful review and testing is still needed, in particular to
  ensure `CRef` is not in scope where temporary `refs` have been created

- `mutable` fields haven't been handled yet

- parts that may not be multi-thread capable, such at coqtop or stm
  have not been addressed on their use of refs

- moreover there could be places that rely on actually threads sharing
  a memory block :crossing_fingers:

- native compiler support has to be reverted back, needs more
  investigation, vm_compute also needs work

- There are out-of-summary refs that may interact poorly with this,
  c.f. rocq-prover#20328

- This PR actually touches many parts that could see cleanup and would
  make code quality better; we could even start to introduce the
  `CRef` discipline without activating thread local storage

- Note Xavier comment about TLS on:
  ocaml/ocaml#11770 (comment)

  > Also, I'd like us to keep in mind that thread-local state, like
  > domain-local state, is a quick fix; avoiding global state entirely
  > is a better way to go about parallelism and concurrency.

- Also note Gabriel Scherer work to upstream TLS into Ocaml:
  ocaml/ocaml#13355

Thanks to Guillaume Munch-Maccagnoni for help and advice, and to
c-cube, Gabriel, and other folks making this happen in OCaml.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants