Add Sys.runtime_executable and caml_sys_proc_self_exe #13728

Merged
dra27 merged 2 commits into ocaml:trunk from dra27:Sys.interpreter
May 15, 2025

@dra27
Member

@dra27 dra27 commented Jan 10, 2025

In bytecode, Sys.executable_name goes to some lengths to refer to the path to wherever the bytecode image came from. However, in both bytecode and native code, on most platforms, the runtime has caml_executable_name, which does whatever is the appropriate equivalent of reading /proc/self/exe.

For some additional testing that I wish to add, I need to know whether the runtime implements caml_executable_name or just returns NULL (because it significantly affects the handling of bytecode startup when argv[0] is being altered). One way to do this is to record the fact in Config.has_caml_executable_name, but this involves a lot of tedious plumbing and duplication between configure.ac, utils/config.generated.ml.in and runtime/unix.c, and that's just to support a test harness. Another way is simply to wrap caml_executable_name as a primitive (although, confusingly, we already have caml_sys_executable_name meaning something else). That seems a bit daft, though - caml_executable_name is (with only one exception) always called during startup anyway. The value should not change during execution of a program, but the ability to query the running executable's name can, in theory, disappear (!!). The simplest approach, therefore, seems to be to squirrel it away at the same time as the derived executable name in caml_sys_init, which is what the first commit does, exposing the squirreled-away value via caml_sys_proc_self_exe. Usefully, in native code, that squirreled-away value is in fact the same pointer as exe_name, so it doesn't even cost any space 😁

As part of the original work on Relocatable OCaml, I found it useful to be able to identify which ocamlrun was being executed from within a bytecode program itself, which is where the second commit originates from. Sys.runtime_executable (originally proposed as Sys.interpreter) uses this primitive to provide the path to the program that is actually executing. With this primitive, make runtop then gives:

OCaml version 5.4.0+dev0-2024-08-25
Enter #help;; for help.

# Sys.executable_name;;
- : string = "./ocaml"
# Sys.runtime_executable;;
- : string = "/home/dra/relocatable/ocaml/boot/ocamlrun"

As ever, completely open to naming suggestions - the primitive goes with the Linux name for the feature, as I figured caml_sys_GetModuleFileName might be less popular and it's worth having a name which is very distinct from argv[0] or caml_sys_executable_name, etc. Sys.runtime might be slightly better than Sys.interpreter as that sounds more consistent in native mode (and we do have Sys.runtime_variant, etc., already).
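
The difference between the two values in the session above can be sketched as follows. This is only an illustration: describe_launch is a hypothetical helper, not something this PR adds, and it takes both names as arguments so that it compiles on compilers that predate the new value. On 5.4 or later one would presumably pass Sys.executable_name and Sys.runtime_executable.

```ocaml
(* Hypothetical helper, not part of the stdlib or of this PR.
   Plain string equality is a crude test: the same file can be named by
   both a relative and an absolute path. *)
let describe_launch ~executable_name ~runtime_executable =
  if String.equal executable_name runtime_executable then
    Printf.sprintf "%s is itself the running executable" executable_name
  else
    Printf.sprintf "%s is being run by %s" executable_name runtime_executable

let () =
  (* Values copied from the make runtop session above. *)
  print_endline
    (describe_launch ~executable_name:"./ocaml"
       ~runtime_executable:"/home/dra/relocatable/ocaml/boot/ocamlrun")
```

For the runtop session this reports that ./ocaml is being run by the boot ocamlrun, which is exactly the information Sys.executable_name alone cannot provide.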

@dra27 dra27 added the stdlib label Jan 10, 2025
@dbuenzli
Contributor

dbuenzli commented Jan 10, 2025

I'm not sure I fully followed your explanations :–) But I think I understand what you want.
Let's improve my confidence: in native code you'd always have assert (Sys.interpreter = Sys.executable_name). EDIT: well, it's written in the docstring!

> Sys.runtime might be slightly better than Sys.interpreter as that sounds more consistent in native mode (and we do have Sys.runtime_variant, etc., already).

Sys.runtime_executable perhaps.

@dra27
Member Author

dra27 commented Jan 10, 2025

Your understanding is indeed correct 😊
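
The invariant discussed above (in native code the new value always equals Sys.executable_name) can be written down as a small check. This is a sketch under assumptions: native_invariant_holds is a hypothetical name, the paths are made up for illustration, and the names are passed as arguments so the code does not depend on a 5.4 stdlib. Nothing is claimed for bytecode, where the two values may legitimately differ.

```ocaml
(* Hypothetical check, not part of this PR. Only the native case is
   constrained; bytecode and other backends are accepted unconditionally. *)
let native_invariant_holds ~(backend : Sys.backend_type) ~executable_name
    ~runtime_executable =
  match backend with
  | Sys.Native -> String.equal executable_name runtime_executable
  | Sys.Bytecode | Sys.Other _ -> true

let () =
  (* Native code: both names must agree (illustrative path). *)
  assert (
    native_invariant_holds ~backend:Sys.Native
      ~executable_name:"/usr/local/bin/prog"
      ~runtime_executable:"/usr/local/bin/prog");
  (* Bytecode: the image and the interpreter may be different files. *)
  assert (
    native_invariant_holds ~backend:Sys.Bytecode ~executable_name:"./ocaml"
      ~runtime_executable:"/home/dra/relocatable/ocaml/boot/ocamlrun")
```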

@dra27
Member Author

dra27 commented Jan 10, 2025

Possibly:

diff --git a/stdlib/sys.mli b/stdlib/sys.mli
index 999fce322d..f50be349e7 100644
--- a/stdlib/sys.mli
+++ b/stdlib/sys.mli
@@ -32,11 +32,12 @@ val executable_name : string
     on the platform and whether the program was compiled to bytecode or a native
     executable. *)

-val interpreter : string
-(** The name of the file containing the interpreter currently running. For
-    native code, this is just {!executable_name}. This name may be absolute or
-    relative to the current directory, depending on the platform (Linux, Windows
-    and macOS should all return absolute paths).
+val runtime_executable : string
+(** The name of the file containing the runtime currently running. For
+    native code, this is always {!executable_name}, but in bytecode it may be
+    ocamlrun, for example. This name may be absolute or relative to the current
+    directory, depending on the platform (Linux, Windows and macOS should all
+    return absolute paths).

     @since 5.4
 *)

?

@dra27 dra27 force-pushed the Sys.interpreter branch 2 times, most recently from 1ba7ca8 to 6f8f3c0 Compare January 10, 2025 15:22
@dbuenzli
Contributor

Note I have reviewed the changes in the diff and those look ok. But I don't have enough context [1] of the system in my head to approve.

Footnotes

  1. I'm still extremely dubious about patch based software evolution.

@dra27
Member Author

dra27 commented Jan 12, 2025

> I'm still extremely dubious about patch based software evolution.

I don’t follow - is that as a general comment, or something specific to how I’m presenting these PRs?

Contributor

@nojb nojb left a comment

Exposing the value in the runtime system seems harmless enough. I am only wondering if it is a good idea to expose this to user programs in the standard library. Are there any known use-cases for it? For "internal" use maybe the C primitive would suffice otherwise. @dra27: what do you think?

on the platform and whether the program was compiled to bytecode or a native
executable. *)

val runtime_executable : string
Contributor

Nit, but from a naming perspective, it would be better in my opinion to suffix this with _name to emphasize the similarity with executable_name.

Contributor

I find Sys.executable_name quite confusing: I'm never sure if it is a filename or file path. If you want something else here I'd rather suggest Sys.runtime_executable_path or Sys.runtime_executable_filepath.

Member Author

I'm not sure either way - part of the problem is that it could be either, although I'd veer towards runtime_executable_path, as it's ideally a full path to the file, and only a filename or implicit/relative path in fallback cases?

@dbuenzli
Contributor

> I don’t follow - is that as a general comment, or something specific to how I’m presenting these PRs?

Sorry. No, nothing against you! It's a comment about the way we practice software evolution through diff reviews in these web UIs. There is a lack of context which makes it difficult for me to really assess the correctness or suitability of a change (in my own projects I usually apply the diffs of PRs in a branch locally and then look around before rebasing on main).

@dra27
Member Author

dra27 commented Jan 13, 2025

I don't mind, @nojb - the primitive is definitely the only part I need, and of course exposing it in the stdlib could be added subsequently. That said, it felt that it fits into the same area of introspection properties as Sys.runtime_variant and Sys.backend_type.

Contributor

@nojb nojb left a comment

LGTM

Since this touches the standard library, a second official approval is needed.

@nojb nojb changed the title Add Sys.interpreter and caml_sys_proc_self_exe Add Sys.runtime_executable and caml_sys_proc_self_exe Jan 13, 2025
@dra27 dra27 force-pushed the Sys.interpreter branch 2 times, most recently from a630961 to 8f15274 Compare January 16, 2025 21:49
@damiendoligez damiendoligez self-requested a review January 22, 2025 14:13
@gasche
Member

gasche commented Feb 5, 2025

This was approved, maybe it should also be merged?

@nojb
Contributor

nojb commented Feb 8, 2025

> This was approved, maybe it should also be merged?

I think @damiendoligez is intending to review the PR.

@dra27
Member Author

dra27 commented Feb 9, 2025

Indeed, it introduces a stdlib function, so it’s still in need of a second core dev approval

@dra27
Member Author

dra27 commented Apr 14, 2025

Ping @damiendoligez (or another core dev willing to add a second approval)?

caml_executable_name is always called in native startup and for all the
non-default bytecode linking mechanisms. Bytecode startup now always
calls caml_executable_name, and this value is stored along with
exe_name.

caml_sys_proc_self_exe returns this stored value as a string option. It
returns None if caml_executable_name is not implemented on a given
platform.

In native mode, this is the same as Sys.executable_name. In bytecode, it is
the path to the interpreter executing Sys.executable_name, which may not
be the same file.
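
The string option result described in the commit messages can be modelled in plain OCaml. This is a sketch, not the code from the PR: both function names are hypothetical, the proc_self_exe argument merely stands in for whatever caml_sys_proc_self_exe returns, and falling back to the executable name when the result is None is an assumption suggested by the discussion of "fallback cases", not something the commit messages state.

```ocaml
(* [proc_self_exe] models the primitive's result: Some path when the
   runtime can report the running executable, None when
   caml_executable_name is not implemented on the platform. *)

(* What a test harness needs to know: is the feature available? *)
let has_caml_executable_name proc_self_exe = Option.is_some proc_self_exe

(* ASSUMPTION: fall back to the executable name when nothing is reported. *)
let runtime_executable_of ~proc_self_exe ~executable_name =
  Option.value proc_self_exe ~default:executable_name

let () =
  (* Platform that reports the running executable (values from the session
     in the PR description). *)
  let reported = Some "/home/dra/relocatable/ocaml/boot/ocamlrun" in
  assert (has_caml_executable_name reported);
  print_endline
    (runtime_executable_of ~proc_self_exe:reported ~executable_name:"./ocaml");
  (* Platform without support: only the fallback is available. *)
  assert (not (has_caml_executable_name None));
  print_endline
    (runtime_executable_of ~proc_self_exe:None ~executable_name:"./ocaml")
```

Returning an option rather than an empty string keeps "not implemented" distinguishable from any real path, which is the property the test harness mentioned in the description relies on.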
@dra27 dra27 force-pushed the Sys.interpreter branch from cba40b3 to 842ae77 Compare May 13, 2025 15:57
@damiendoligez damiendoligez self-assigned this May 14, 2025
Member

@gasche gasche left a comment

I would like the relocatable stuff to go ahead with as few hurdles as possible, so I'm giving a second approval on behalf of my trust in all of (in alphabetical order) @dbuenzli, @dra27 and @nojb who discussed this already and converged to the present state.

@dra27 dra27 merged commit 28f4b40 into ocaml:trunk May 15, 2025
24 checks passed
@dra27
Member Author

dra27 commented May 15, 2025

Thanks, @gasche!
