Support BUILD_PATH_PREFIX_MAP in ocamldebug #12085
richardlford wants to merge 1 commit into ocaml:trunk
|
There's discussion still in the original issue, so this doesn't necessarily want amending yet, but a few comments:
|
15d18e2 to
7d960f8
Compare
|
As mentioned in #12083 (comment), I've decided to use |
7d960f8 to
47ea3f6
Compare
|
@dra27, @gasche, @dbuenzli: I think I'm done for today. I think the implementation is complete. The things I see remaining are:
I'll resume these efforts tomorrow. If you have any review comments that would be helpful. For example, do you like the idea of a debugger variable to control BUILD_PATH_PREFIX_MAP from within the debugger? I thought it was a nice addition. Any comments on the manual or man page changes? |
|
Isn't it `BUILD_PATH_PREFIX_MAP` that you mean, rather than `BUILD_PATH_PREFIX_PATH`?
|
|
Richard L Ford (2023/03/15 09:36 -0700):
@Octachron maybe? But it's no use to repeatedly mention people IMO.
|
|
Richard L Ford (2023/03/15 15:21 -0700):
@dra27, @gasche, @dbuenzli: I think I'm done for today. I think the implementation is complete. The things I see remaining are:
- Diagnose why a test is failing. It does not fail for me locally.
- Could it be Windows related?
Will have a look into it, today, tomorrow or at the beginning of next
week.
- Write some tests for the test suite. I need to figure out how to do that. I have tested the functionality manually and it all seems to be working fine.
I can help with that one. Feel free to contact me on Slack or by email
and we can even organize an audio meeting.
I'll resume these efforts tomorrow. If you have any review comments
that would be helpful. For example, do you like the idea of a debugger
variable to control BUILD_PATH_PREFIX_MAP from within the debugger? I
thought it was a nice addition. Any comments on the manual or man page
changes?
Did you have a look at how other debuggers deal with this and how they
give access to the feature? It may be good to take inspiration there.
I think the `ocamldebug` guru is @damiendoligez, who may also have an
opinion.
|
19f5d3c to
e862ee6
Compare
|
I squashed and force-pushed. |
You are right. I hope I have fixed all the places where I had that. |
Sorry, I'm not sure what the etiquette is. If someone has contributed to a PR conversation, do they automatically get notified if there are any changes, without being mentioned again? |
|
Richard L Ford (2023/03/16 01:45 -0700):
> But it's no use to repeatedly mention people IMO.
Sorry, I'm not sure what the etiquette is. If someone has contributed
to a PR conversation, do they automatically get notified if there are
any changes, without being mentioned again?
People can configure whether they get notified for pushes or not.
Independently of that, the way people are notified of comments is up to
them. I think most of the core developers of OCaml are `watching` the
repository, so they see everything, be it through web notifications or
through email. Then, as soon as they have been mentioned once on a
PR/issue/discussion, or as soon as they participate, the e-mails they
receive are sent in Cc as opposed to those sent for threads one is not
participating in, which are sent in Bcc.
This happens until threads are closed or people unsubscribe from them.
|
e862ee6 to
1ef3f59
Compare
|
I pushed another commit that completes the work for the PR.
Since this includes a bit of new code, and especially since it includes compiler changes, I'm expecting it will need further review. In the meantime, if that one test is still failing on Windows, I'll look into it. My new tests were based on that test, so I'm quite familiar with it. |
@shindere, I found your helpful ocamltest tutorial. With that and reading the ocamltest sources I was able to write my tests. As mentioned above, I added a new ocamltest action, so you may be interested in reviewing it (or suggest a way I could have done without it). |
Generally the work here is impressive, you are touching many different parts of the tools with apparent ease and nice documentation. I have a few concerns with the current proposal however:
- There are too many things mixed together that need to be looked at by different people. Your second commit (1) adds an entirely new mode of use of BUILD_PATH_PREFIX_MAP in the Location API and (2) uses it in various places of the compiler and (3) implements a new ocamltest feature (4) uses them in new tests. This is way too much stuff in a single commit and in fact arguably it would be better in separate PRs. My proposal would be to:
- agree on the support we want for BUILD_PATH_PREFIX_MAP and merge that, without tests for now
- send the `expand` ocamltest action as a follow-up PR, along with ocamldebug tests using it serving as a justification for the change
- I am not sure about the design using `:` to cram multiple potential output paths for a single input path. I think that `:` in the format is meant to allow users to specify paths themselves containing the `:` character, for example on Windows. We discussed in the past having an option to "invert" the semantics of the BUILD_PATH_PREFIX_MAP file, and I would note that this suffices to specify one-to-many rewrite relations: if you write a map that maps both `foo` and `bar` to `/workspace_root`, then playing the "reverse" mapping will naturally return both `foo/a` and `bar/a` for the input `/workspace_root/a`. (But maybe there are other approaches.)
- I don't like your approach regarding `Location.absolute_path_always`. When we introduced BUILD_PATH_PREFIX_MAP we discussed whether to rewrite only relative paths or all kinds of paths. I think I made the choice to rewrite only relative paths to make the change less invasive / more controlled. Maybe that was the wrong choice, as @dbuenzli apparently believes, and we could revisit it and discuss applying the rewriting on absolute paths as well. However, your current suggestion feels like a hack to avoid having this discussion: you add both modes (rewrite only relative paths, rewrite all paths), and we start randomly using one or the other inside the codebase depending on what tests tell us that we should probably do. This increases the total complexity of the system, and I would like to avoid this extra complexity.
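To make the "reverse mapping" remark concrete, here is a minimal OCaml sketch. It is not code from the compiler: the `pair` record, `strip_prefix` and `inverse_rewrite` are hypothetical names introduced only for illustration, and prefix matching is done on raw strings without any path normalisation.

```ocaml
(* Hypothetical model of a build-time map: during the build, a path that
   starts with [source] has that prefix replaced by [target]. *)
type pair = { source : string; target : string }

(* [strip_prefix ~prefix path] returns what follows [prefix] in [path],
   or [None] when [prefix] is not a (string) prefix of [path]. *)
let strip_prefix ~prefix path =
  let lp = String.length prefix and l = String.length path in
  if l >= lp && String.sub path 0 lp = prefix
  then Some (String.sub path lp (l - lp))
  else None

(* Play the map in reverse: every pair whose [target] is a prefix of [path]
   contributes one candidate, so the result is naturally a list. *)
let inverse_rewrite (map : pair list) (path : string) : string list =
  List.filter_map
    (fun { source; target } ->
       match strip_prefix ~prefix:target path with
       | Some rest -> Some (source ^ rest)
       | None -> None)
    map

let () =
  let map =
    [ { source = "foo"; target = "/workspace_root" };
      { source = "bar"; target = "/workspace_root" } ] in
  (* Prints one candidate per line for the input /workspace_root/a. *)
  List.iter print_endline (inverse_rewrite map "/workspace_root/a")
```

With both `foo` and `bar` mapped to `/workspace_root`, the reversed lookup yields one candidate per pair, which is the one-to-many behaviour described in the review without any change to the map syntax.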
    else if base = parent_dir_name then dirname (aux dir)
    else concat (aux dir) base
  in
  aux s
I don't like the amount of code duplication with the function absolute_path above, and I am not fond of the name either -- _always suggests that we always do something, but what?
If we need to have this feature of deciding locally to rewrite only relative paths or also absolute paths, I would propose the following interface:
val absolute_path : rewrite:[ `Absolute | `All ] -> string -> string
It's not about a wrong choice. It's just that relative paths are not that much of a problem for reproducible builds if you build from a consistent location and only use relative paths (which I don't consider to be a good idea from a build usability point of view). E.g. if you have your It's absolute paths that are problematic and that's precisely the ones that OCaml doesn't support… |
|
@gasche: I agree. Here are some comments on your review.
I agree. I was just trying to get something complete but am happy to
Agreed.
Let me think. Let's assume first of all, as we previously discussed, Currently, at least, Dune maps the root of the build tree Let us consider two possible source layouts. Layout I: All libraries are in directories directly below the top, but they may
and the file part of debug events would be
And similarly for pkgb, pkgc, etc. Let's suppose that our current In this case, no rewrite of the plain
We would need one such mapping for each library (multiple per package In layout I, there is no advantage to being able to map to multiple Layout II: In this layout, below the top level of the workspace there is a The compilation of file
and the file part of debug events would be
And similarly for pkgb, pkgc, etc. As before, let's suppose that our Unlike Layout I, in this case, rewriting the plain
But we will want to give priority to our source directory, then to our But it only works with Layout II. Otherwise, you have to resort to However, as you mentioned above (and I had forgotten), Windows
The issue is whether you want any paths in your output that are If Layout II were used, then both your local build root as well as the Of course, the compiler can be willing to map absolute paths other In the compiler changes I was making, I wanted to be careful not do |
|
A few tidbits from the bazel guy. There is no Similarly for inputs. Every input must be registered with Bazel, which does a bunch of symlinking so that when an action runs its inputs are relative to a bazel-defined dir. Which means incidentally that if you fail to register an input file, it will be ignored even if it exists where you expect it. Example just to illustrate the kind of path it creates: And that path can vary even for files in the same directory, since you can dynamically change the build configuration, and Bazel will put each build config in its own space. I guess the lesson here is: never use absolute paths unless you really really, no really, know what you are doing. I'm not really sure what this implies for this proposal since I don't know much about the debugger. Maybe BUILD_PATH_PREFIX_MAP would suffice. What I do know is that Dune is not the One True Build system and should not be treated as such (monoculture bad!). Something else I know: this is not the only place where tooling puts src paths into outputs in a bad way. Some ppxes and some tests in the ocaml repo do this, resulting in non-replicable builds. The problem is that the tests/ppxes as run by dune put the filename into the output, I think (recalling) because dune does a chdir before running stuff. But with bazel the outputs contain a full path (I don't remember if it is absolute or relative). So anything involving diffs breaks. I've been able to work around this for some ppxes by passing |
|
And here's an impertinent question. Also embarrassing since I should know the answer but I don't: why are we even discussing running the debugger against installed libs? I've noticed the profligate use of |
Good question. I can perhaps think of a couple of reasons:
Regarding getting to your "own code", the problem I've encountered is that ocamldebug's facility for setting a breakpoint at a function rarely works, because it can only be executed in a context where the target function is in-scope, i.e. needed for some purpose. And that rarely happens. Fortunately, I have in my pipeline a fix for that. There is enough information in the debug information to know before execution starts where all of the functions are. So I soon plan to submit a PR for ocamldebug that will let you set a function breakpoint whenever you want, and particularly before starting the program. |
|
Regarding multiple target paths, I've had some enlightenment. First of all, it should have been obvious that ':' would not work on Also, I was incorrect in saying that multiple targets would not The path derived from the directories in the debug information is For brevity, let's abbreviate:
Layout I: All libraries are in directories directly below
and the file part of debug events would be
and
In this case, no single rewrite of the plain
The above mappings will find all the appropriate sources and binaries Layout II: In this layout, below the The compilation of file
and the file part of debug events would be
Now the following multiple rewrites of
In conclusion, multiple mappings are useful. They are simplest with |
|
@shindere, I need some help with my tests on Windows. To set the value of BUILD_PATH_PREFIX_MAP I am currently using some lines like The problem is that on Windows, One idea is to write a script that takes its argument and a target file, and then does this encoding and writes the result to the target file. But how would I transfer the value from the file back into a variable? Or maybe enhance the current "script" action, so if an "output_variable" parameter was set to the name of a variable, then the output from stdout would be captured and stored in that variable? Or I've thought it might be handy if ocamltest had a concept of built-in functions. So maybe something like: where "encode" would be a built-in function to do the encoding. The variable registry could be enhanced to allow marking a variable as a function and attaching an ocaml function to it. Or alternatively something like using shell-like back-tick notation to run a script and return the result. I would also like to be able to run the CI environment locally so I can catch these problems before pushing changes. Also, what is the Windows environment like? I see that tests are using the Thanks for your help. |
d307d1d to
6c0ba9d
Compare
|
I have made more changes, but also squashed everything into one. As @gasche suggested, this will be split up into multiple PRs (at least 2), but for now, I'm just trying to get a clean test run (on Windows). When that is clean, I will work on splitting it up. The changes since last time:
|
|
I pushed another commit. I decided the problem I was addressing for the debugger I captured the current spec on my github account, then made
|
aa13af6 to
23db1e3
Compare
Note this is a work in progress. The changes will be split into multiple PRs.

In order to produce reproducible builds (independent of the location of the build), starting from Dune 3.0, Dune has mapped references to the workspace directory to “/workspace_root”. This change enables ocamldebug to recover the locations of the source code. The ocamldebug user must set the BUILD_PATH_PREFIX_MAP environment variable to effect the mapping.

If we only want to map "/workspace_root" to a single directory, say mydir, we would do:

    export BUILD_PATH_PREFIX_MAP="mydir=/workspace_root"

If we want to map to multiple directories, we must separate them with ';'. So to map to mydir1 and mydir2, the setting would be:

    export BUILD_PATH_PREFIX_MAP="mydir1;mydir2=/workspace_root"

Then if ocamldebug has a path like "/workspace_root/stuff", this is converted to the list ["mydir1/stuff"; "mydir2/stuff"].

Add "mapping" debugger variable to control BUILD_PATH_PREFIX_MAP.

In order to allow the BUILD_PATH_PREFIX_MAP environment variable to be set from within ocamldebug, a new debugger variable, "mapping", has been added. It works with the usual set and show commands. When set, the value is put into BUILD_PATH_PREFIX_MAP in ocamldebug's environment, just as if it had been set before entering ocamldebug. Showing it gives the current value, including the value on entry, if set. This also makes it possible to set it from .ocamldebug files.

The manual and man pages have been updated to describe the change.

1. I added four tests for the ocamldebug support for BUILD_PATH_PREFIX_MAP. One test shows that if debug paths are sanitized by the compiler, then the debugger is unable to find the sources. The other three tests show three ways to provide the BUILD_PATH_PREFIX_MAP to ocamldebug:
   a. Setting the environment variable externally.
   b. Setting the "mapping" ocamldebug variable in an input script.
   c. Setting the "mapping" ocamldebug variable in a ".ocamldebug" file.
2. To write the tests I needed a way to have script or ".ocamldebug" files that had ocamltest variables expanded. I did not see a way to do that, so I added a new "expand" action to ocamltest. It is mostly like the "copy" action, but the source file is read a line at a time and the lines are expanded. It does not support source directories, but the destination can be a directory.
3. I used a dumper to look at the directories in the debug information and noticed that there were still some unsanitized directories. I tracked the source of these in the compiler and modified it so all debug directories are sanitized.

More ocamltest enhancements, add/fix tests

1. I enhanced ocamltest to have a facility for making builtin functions. These are like ocamltest variables but have a function attached to them, and when the variable is expanded, the arguments are first expanded and then the function is called and its result is returned.
2. Added a new "dumpenv_expanded" action which not only shows the value of each variable, but also what they expand to. When variables are originally assigned, their RHS is not expanded, but only later when the variable itself is looked up.
3. Two builtin functions were defined:
   - bppm_decode does BUILD_PATH_PREFIX_MAP decoding
   - bppm_encode does BUILD_PATH_PREFIX_MAP encoding
4. Tests of the new builtin functions were added.
5. The debugger prefix mapping tests were modified to use bppm_encode. They, in fact, were the motivation for the ocamltest changes.

Use load/install printer trick to add ocamldebug unit test

When a printer is loaded and installed into the debugger, it has access to the internals of the debugger. I factored out the function that does the BUILD_PATH_PREFIX_MAP processing in ocamldebug and was able to unit test it with code in the fake printer.
Modify for revised BUILD_PATH_PREFIX_MAP spec

I decided the problem I was addressing for the debugger was more general and that the BUILD_PATH_PREFIX_MAP spec should be generalized to handle it. See richardlford/build-path-prefix-map-spec#1.

1. Modified BUILD_PATH_PREFIX_MAP handling accordingly.
2. Used the new facilities in ocamldebug.
3. Updated one test for the new spec.
4. Modified encoding and decoding functions in ocamltest.

Avoid adding dir to Load_path if already there, some test changes

Sanitize paths in debug events

Previously we were sanitizing the debug directories but were assuming the paths in debug events were relative. But if the user passes an absolute path to the compiler, it was putting absolute paths in the debug events.

Always rewrite in Location.absolute_path

1. Modified Location.absolute_path to do mapping for BUILD_PATH_PREFIX_MAP regardless of whether the path is relative or absolute. Previously it only did it for relative paths.
2. Because of item 1, deleted Location.absolute_path_always.
3. In bytelink.ml, move make_absolute locally within link_bytecode and do not do BUILD_PATH_PREFIX_MAP rewriting. In this case the path is what goes in the shebang, and I do not see how we would want rewriting in that case, unless there was a later processor to rewrite the shebangs.
4. Update man page and manual for the revised BUILD_PATH_PREFIX_MAP spec (approval pending).

Prepare to be able to do Dune tests

1. Added an action helper for testing if a program is available in PATH.
2. Use the helper to make an action, has_dune, that detects whether dune is available.

Fixes ocaml#12083
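The "mydir1;mydir2=/workspace_root" example from the commit message can be sketched in a few lines of OCaml. This is only an illustration of the intended behaviour, not the PR's code: `parse_entry`, `starts_with` and `search_list` are hypothetical names, a single map entry is assumed, and the percent-decoding required by the spec is deliberately left out.

```ocaml
(* Parse one "targets=source" entry, where the targets on the left-hand
   side are separated by ';'. Percent-decoding is omitted in this sketch. *)
let parse_entry (entry : string) : (string list * string) option =
  match String.index_opt entry '=' with
  | None -> None
  | Some i ->
    let targets = String.split_on_char ';' (String.sub entry 0 i) in
    let source = String.sub entry (i + 1) (String.length entry - i - 1) in
    Some (targets, source)

(* Plain string-prefix test; no path normalisation is attempted. *)
let starts_with ~prefix s =
  String.length s >= String.length prefix
  && String.sub s 0 (String.length prefix) = prefix

(* Rewrite [path] into the list of candidate paths, one per target,
   in the order the targets were written. *)
let search_list (entry : string) (path : string) : string list =
  match parse_entry entry with
  | Some (targets, source) when starts_with ~prefix:source path ->
    let n = String.length source in
    let rest = String.sub path n (String.length path - n) in
    List.map (fun t -> t ^ rest) targets
  | _ -> []

let () =
  search_list "mydir1;mydir2=/workspace_root" "/workspace_root/stuff"
  |> List.iter print_endline
```

For the input above the sketch produces `mydir1/stuff` followed by `mydir2/stuff`, matching the list given in the commit message.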
b9a7032 to
d8c55c9
Compare
|
Thanks to all who have given feedback so far. I think the functionality is now correct, so I will proceed to break it into the following pieces:
I'm going to leave this PR in its current state until those are completed so that reviewers can see all three if desired. |
The BUILD_PATH_PREFIX_MAP specification tells how to use that environment variable to achieve reproducible builds, i.e. builds of products that do not leak absolute paths from the build environment. See https://reproducible-builds.org/specs/build-path-prefix-map.

However, that specification only covers half of the story. Let us call the building of reproducible products the "Build Phase". That is the phase covered by the existing specification. Let us define the "Deployment phase" as the phase where you accept a built reproducible product and make use of it, i.e. deploy it. An example would be the debugger taking a reproducible binary and letting the user debug it, showing the user the source code, etc.

These changes were originally being made for the benefit of the debugger, but I realized they are more generally applicable and that a generalization of the spec was desirable. I have a proposed revision of that specification. I have captured the original specification on github, and then created a pull request with my proposed changes. See richardlford/build-path-prefix-map-spec#1

This PR implements the revised specification. Also, this is part of a larger PR, ocaml#12085, that is being split up because it was too large. In addition to these changes, that PR includes ocamltest enhancements plus the tests for these changes. See it for prior discussion and the other changes.

I will now describe the essence of the changes. See the revised document for motivation.

1. utils/build_path_prefix_map.{ml,mli}
   The essential change to the spec is that the left-hand side of a mapping pair (formerly the target) now may have a list of targets, separated by ';'. Because ';' is now used as a delimiter, it is escaped when a path is encoded (to "%,"). Because of the encoding, there is no restriction on characters in a path. A list of targets that may be returned is called a "search list". The API is extended with functions that return search lists. During the build phase we expect that the maps will only have a single target. An open issue is what to do if there is more than one target during the build phase. Currently we ignore the mapping.
2. parsing/location.{ml, mli}
   This has convenience functions for using the above mapping API. It reads and caches the environment variable so the end-user does not need to.
   a. absolute_path was modified so that BUILD_PATH_PREFIX_MAP rewriting is done for both absolute and relative paths. Relative paths are made absolute by appending to the cwd. Previously only relative paths were rewritten. One discovery during testing was that if the compiler is given an absolute path as the input source, the debug event information had that absolute path (see below).
   b. rewrite_to_search_list: new function which returns the search list from the mapping.
   c. search_list_find: new function that uses rewrite_to_search_list to get a list of paths and then looks for the first one that exists.
3. bytecomp/emitcode.ml
   Added sanitizing of paths in the debug events. If the compiler is given absolute source paths this was leaking absolute build paths.
4. bytecomp/bytelink.ml
   Never rewrite when producing the path for a shebang. The value written is from the user "-use-runtime" option, and I do not believe that should be rewritten.
5. utils/load_path.ml
   Modified so that a directory is not added to the load path if it is already in the load path. I was concerned that failing to do this might not preserve the desired order.
6. debugger/command_line.ml
   A debugger variable, "mapping", was added, which is tied to the BUILD_PATH_PREFIX_MAP environment variable. So "set mapping value" will set BUILD_PATH_PREFIX_MAP, and "show mapping" will show the value of BUILD_PATH_PREFIX_MAP. This allows the user to set it programmatically, possibly from the ".ocamldebug" file.
7. debugger/program_management.ml
   Reverse the list of directories so the ones with highest priority are first.
8. debugger/source.ml
   In source_of_module, use the Location.search_list_find API.
9. debugger/symbols.{ml, mli}
   a. Take care to preserve the order of directories, so the highest priority will come first.
   b. Expose bppm_expand_path and get_load_path so they can be called from a fake printer which, when loaded and installed in a test, can do unit testing of ocamldebug.
10. utils/misc.ml
    When printing the mapping, take into account multiple targets.
11. Man page and manual updated.

Fixes ocaml#12083
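The "look for the first one that exists" step described for search_list_find can be illustrated with a very small OCaml sketch. The name `first_existing` is hypothetical and the real function in the PR may differ; the only assumption is that the search list is ordered with the highest-priority candidate first.

```ocaml
(* Return the first candidate that exists on disk, if any.
   Candidates are assumed to be listed highest priority first. *)
let first_existing (candidates : string list) : string option =
  List.find_opt Sys.file_exists candidates

let () =
  (* The two paths are illustrative; neither needs to exist. *)
  match first_existing [ "mydir1/stuff.ml"; "mydir2/stuff.ml" ] with
  | Some p -> Printf.printf "using %s\n" p
  | None -> print_endline "no candidate exists"
```

Keeping the existence check separate from the rewriting keeps the rewriting itself pure and easy to unit test, while only this last step touches the file system.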
This is part 2 of a larger PR, ocaml#12085, which includes a part 1 (compiler and debugger changes), this part, and a third part which is tests of part 1 that make use of (and hence motivate and test) these changes. Please see that PR for example uses of these changes.

1. Add the ability to have scripts or files with ocamltest variable references expanded. Added a new "expand" action to ocamltest. It is mostly like the "copy" action, but the source file is read a line at a time and the lines are expanded. It does not support source directories, but the destination can be a directory.
2. Enhanced ocamltest to have a facility for making builtin functions. These are like ocamltest variables but have a function attached to them, and when the variable is expanded, the arguments are first expanded and then the function is called and its result is returned. Currently they only take one parameter, but it should not be too hard to add the ability to have multiple arguments.
3. Added a new "dumpenv_expanded" action which not only shows the value of each variable, but also what they expand to. When variables are originally assigned, their RHS is not expanded, but only later when the variable itself is looked up.
4. Two builtin functions were defined:
   - bppm_decode does BUILD_PATH_PREFIX_MAP decoding
   - bppm_encode does BUILD_PATH_PREFIX_MAP encoding
   See https://reproducible-builds.org/specs/build-path-prefix-map/ and richardlford/build-path-prefix-map-spec#1.
5. Prepare to be able to do Dune tests:
   5a. Added an action helper for testing if a program is available in PATH.
   5b. Use the helper to make an action, has_dune, that detects whether dune is available.
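For readers unfamiliar with the encoding that bppm_encode and bppm_decode deal with, here is a hedged OCaml sketch of the escaping defined by the published BUILD_PATH_PREFIX_MAP specification: `%` becomes `%#`, `=` becomes `%+` and `:` becomes `%.`. The names `encode_path` and `decode_path` are hypothetical and this is not the ocamltest implementation; the additional escape for `;` from the proposed revision is not included.

```ocaml
(* Escape the three characters that are special in BUILD_PATH_PREFIX_MAP. *)
let encode_path (s : string) : string =
  let buf = Buffer.create (String.length s) in
  String.iter
    (function
      | '%' -> Buffer.add_string buf "%#"
      | '=' -> Buffer.add_string buf "%+"
      | ':' -> Buffer.add_string buf "%."
      | c -> Buffer.add_char buf c)
    s;
  Buffer.contents buf

(* Inverse of [encode_path]; rejects unknown or truncated escapes. *)
let decode_path (s : string) : (string, string) result =
  let n = String.length s in
  let buf = Buffer.create n in
  let rec go i =
    if i >= n then Ok (Buffer.contents buf)
    else if s.[i] <> '%' then begin
      Buffer.add_char buf s.[i];
      go (i + 1)
    end
    else if i + 1 >= n then Error "dangling '%' at end of input"
    else begin
      match s.[i + 1] with
      | '#' -> Buffer.add_char buf '%'; go (i + 2)
      | '+' -> Buffer.add_char buf '='; go (i + 2)
      | '.' -> Buffer.add_char buf ':'; go (i + 2)
      | c -> Error (Printf.sprintf "invalid escape %%%c" c)
    end
  in
  go 0

let () =
  (* A Windows-style path is the typical case where ':' must be escaped. *)
  print_endline (encode_path "C:/build=dir")
```

Encoding is what makes a Windows path such as `C:/build=dir` safe to place in the variable, which is the situation the Windows test failures discussed earlier ran into.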
The BUILD_PATH_PREFIX_MAP specification tells how to use that environment variable to achieve reproducible builds, i.e. builds of products that do not leak absolute paths from the build environment. See https://reproducible-builds.org/specs/build-path-prefix-map.

However, that specification only describes half of the story. Let us call the building of reproducible products the "Build Phase". That is the phase covered by the existing specification. Let us define the "Deployment phase" as the phase where you accept a built reproducible product and make use of it, i.e. deploy it. An example would be the debugger taking a reproducible binary and letting the user debug it, showing the user the source code, etc.

We use the same mechanism in the deployment phase. Then the BUILD_PATH_PREFIX_MAP must be set up with the logical inverse of the mapping used during the build phase.

Also, this is part of a larger PR, ocaml#12085, that is being split up because it was too large. In addition to these changes, that PR includes ocamltest enhancements plus the tests for these changes. See it for prior discussion and the other changes.

I will now describe the essence of the changes. See the revised document for motivation.

1. parsing/location.{ml, mli}
   This has convenience functions for using the above mapping API. It reads and caches the environment variable so the end-user does not need to. Location.absolute_path was modified so that BUILD_PATH_PREFIX_MAP rewriting is done for both absolute and relative paths. Relative paths are made absolute by appending to the cwd. Previously only relative paths were rewritten. One discovery during testing was that if the compiler is given an absolute path as the input source, the debug event information had that absolute path (see below).
2. bytecomp/emitcode.ml
   Added sanitizing of paths in the debug events. If the compiler is given absolute source paths this was leaking absolute build paths.
3. bytecomp/bytelink.ml
   Rewrite when producing the path for a shebang. The value written is from the user "-use-runtime" option. The one setting BUILD_PATH_PREFIX_MAP will have control over whether this does anything or not.
4. debugger/command_line.ml
   A debugger variable, "mapping", was added, which is tied to the BUILD_PATH_PREFIX_MAP environment variable. So "set mapping value" will set BUILD_PATH_PREFIX_MAP, and "show mapping" will show the value of BUILD_PATH_PREFIX_MAP. This allows the user to set it programmatically, possibly from the ".ocamldebug" file.
5. debugger/source.ml
   In source_of_module, use the Location.rewrite_absolute_path API.
6. debugger/symbols.{ml, mli}
   a. Avoid adding directories redundantly.
   b. Expose bppm_expand_path and get_load_path so they can be called from a fake printer which, when loaded and installed in a test, can do unit testing of ocamldebug.
7. Man page and manual updated.

Fixes ocaml#12083
The BUILD_PATH_PREFIX_MAP specification tells how to use that environment variable to achieve reproducible builds, i.e. builds of products that do not leak absolute paths from the build environment. See https://reproducible-builds.org/specs/build-path-prefix-map.

However, that specification only describes half of the story. Let us call the building of reproducible products the "Build Phase". That is the phase covered by the existing specification. Let us define the "Deployment Phase" as the phase where you accept a built reproducible product and make use of it, i.e. deploy it. An example would be the debugger taking a reproducible binary and letting the user debug it, showing the user the source code, etc.

We use the same mechanism in the deployment phase. BUILD_PATH_PREFIX_MAP must then be set up with the logical inverse of the mapping used during the build phase.

Also, this is part of a larger PR, ocaml#12085, that is being split up because it was too large. In addition to these changes, that PR includes ocamltest enhancements plus the tests for these changes. See it for prior discussion and the other changes.

I will now describe the essence of the changes.

1. utils/build_path_prefix_map.{ml,mli}
   Added these functions to aid in use of the mapping during the deployment phase. They allow skipping mapping entries that are not applicable (find_rewrite). find_rewrite is then used to implement rewrite_exists and matching_dirs.

   val find_rewrite : map -> (path -> bool) -> path -> path option
   (** [find_rewrite map pred path] tries to find a source in [map] that is a prefix of the input [path] and that produces a result satisfying predicate [pred]. If it succeeds, it replaces this prefix with the corresponding target. If it fails, it just returns [None]. *)

   val rewrite_exists : map -> path -> path option
   (** [rewrite_exists map path] tries to find a source in [map] that maps to a result that exists in the file system. If so, returns Some result. Otherwise, if there was at least one source that was a prefix of path, it is assumed that the map is deficient and Not_found is raised. If no source prefixes were found, None is returned. *)

   val matching_dirs : map -> path -> path list
   (** [matching_dirs map absdir] accumulates a list of existing directories, [dirs], that are the result of mapping abstract directory, [absdir], over all the mapping pairs in [map]. The list [dirs] will be in priority order (head as highest priority). If there is success, returns [dirs]. If no source in the map was a prefix of [absdir], returns [[]]. If at least one source in the map was a prefix of [absdir], but none of the mappings were existing directories, raises Not_found. *)

2. parsing/location.{ml, mli}
   This has convenience functions for using the above mapping API. It reads and caches the environment variable so the end-user does not need to. Location.absolute_path was modified so that BUILD_PATH_PREFIX_MAP rewriting is done for both absolute and relative paths. Relative paths are made absolute by appending them to the cwd. Previously only relative paths were rewritten. One discovery during testing was that if the compiler is given an absolute path as the input source, the debug event information had that absolute path (see below). New functions are:

   val rewrite_exists: string -> string option option
   (** [rewrite_exists path] uses a BUILD_PATH_PREFIX_MAP mapping (https://reproducible-builds.org/specs/build-path-prefix-map/) and tries to find a source in the mapping that maps to a result that exists in the file system. There are the following return values:
       - None, means BUILD_PATH_PREFIX_MAP is not set.
       - Some None, means no source prefixes of [path] were found in the mapping.
       - Some (Some target), means target is the first file (in priority order) that [path] mapped to that exists in the file system.
       - Not_found raised, means some source prefixes in the map were found that matched [path], but none of the results existed in the file system. *)

   val matching_dirs: string -> string list option
   (** [matching_dirs absdir] accumulates a list of existing directories, [dirs], that are the result of mapping abstract directory, [absdir], over all the mapping pairs in the BUILD_PATH_PREFIX_MAP environment variable. The list [dirs] will be in priority order (head as highest priority). The possible results are:
       - None, means BUILD_PATH_PREFIX_MAP is not set.
       - Some [], means no source prefixes were found in the mapping.
       - Some dirs, means dirs are the directories found.
       - Not_found raised, means some source prefixes in the map were found that matched [absdir], but none of the mapping results were existing directories.
       See the BUILD_PATH_PREFIX_MAP spec at (https://reproducible-builds.org/specs/build-path-prefix-map/) *)

3. bytecomp/emitcode.ml
   Added sanitizing of paths in the debug events. If the compiler is given absolute source paths, this was leaking absolute build paths.

4. bytecomp/bytelink.ml
   Rewrite when producing the path for a shebang. The value written comes from the user's "-use-runtime" option. Whoever sets BUILD_PATH_PREFIX_MAP controls whether this has any effect.

5. debugger/command_line.ml
   A debugger variable, "mapping", was added, which is tied to the BUILD_PATH_PREFIX_MAP environment variable. So "set mapping value" will set BUILD_PATH_PREFIX_MAP, and "show mapping" will show the value of BUILD_PATH_PREFIX_MAP. This allows the user to set it programmatically, possibly from the ".ocamldebug" file.

6. debugger/source.ml
   In source_of_module, use Location.rewrite_exists.

7. debugger/symbols.{ml, mli}
   a. Avoid adding directories redundantly.
   b. Expose bppm_expand_path and get_load_path so they can be called from a fake printer which, when loaded and installed in a test, can do unit testing of ocamldebug.
   c. Use Location.matching_dirs to expand directories.

8. Man page and manual updated.

Fixes ocaml#12083
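To make the four outcomes of the Location.rewrite_exists contract concrete, here is a minimal sketch of how a caller such as the debugger might branch on them. The `outcome` type and the `classify` function are hypothetical names introduced for illustration, and the lookup is passed in as a parameter so that the sketch does not depend on the compiler's own modules.

```ocaml
(* Hypothetical result type naming the four documented outcomes. *)
type outcome =
  | Map_unset            (* BUILD_PATH_PREFIX_MAP is not set *)
  | No_prefix_matched    (* the map has no source that prefixes the path *)
  | Found of string      (* first rewritten path that exists *)
  | Misconfigured        (* prefixes matched, but no result exists *)

(* [classify lookup path] runs [lookup], a stand-in with the same type as
   the described Location.rewrite_exists, and names the result. *)
let classify (lookup : string -> string option option) path =
  match lookup path with
  | None -> Map_unset
  | Some None -> No_prefix_matched
  | Some (Some target) -> Found target
  | exception Not_found -> Misconfigured

(* Example with a fake lookup that knows about a single file. *)
let fake_lookup path =
  if path = "/workspace/a.ml" then Some (Some "/home/alice/proj/a.ml")
  else Some None

let () =
  match classify fake_lookup "/workspace/a.ml" with
  | Found target -> print_endline ("source found at " ^ target)
  | Map_unset | No_prefix_matched -> print_endline "using the path unchanged"
  | Misconfigured -> print_endline "the mapping looks deficient"
```

Raising Not_found for the "matched but nothing exists" case keeps the common paths in the return type while still letting a caller report a probable misconfiguration separately.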
The BUILD_PATH_PREFIX_MAP specification tells how to use that environment variable to achieve reproducible builds, i.e. builds of products that do not leak absolute paths from the build environment. See https://reproducible-builds.org/specs/build-path-prefix-map.

However, that specification only describes half of the story. Let us call the building of reproducible products the "Build Phase". That is the phase covered by the existing specification. Let us define the "Deployment Phase" as the phase where you accept a built reproducible product and make use of it, i.e. deploy it. An example would be the debugger taking a reproducible binary and letting the user debug it, showing the user the source code, etc.

We use the same mechanism in the deployment phase. BUILD_PATH_PREFIX_MAP must then be set up with the logical inverse of the mapping used during the build phase. This PR generalizes the functions in Build_path_prefix_map and Location to facilitate the inverse map.

Also, this is part of a larger PR, ocaml#12126, which itself was part of ocaml#12085, which is being split up because it was too large. In addition to these changes, that PR includes ocamltest enhancements plus the tests for these changes. See it for prior discussion and the other changes.

I will now describe the essence of the changes.

1. utils/build_path_prefix_map.{ml,mli}
   The API is now:

   val rewrite_first : map -> path -> path option
   (** [rewrite_first map path] tries to find a source in [map] that is a prefix of the input [path]. If it succeeds, it replaces this prefix with the corresponding target. If it fails, it just returns [None]. *)

   val rewrite_all : map -> path -> path list
   (** [rewrite_all map path] finds all sources in [map] that are a prefix of the input [path]. For each matching source, in priority order, it replaces this prefix with the corresponding target and adds the result to the returned list. If there are no matches, it just returns [[]]. *)

   val rewrite : map -> path -> path

2. parsing/location.{ml,mli}
   Use the new API and rename the rewriting functions.

   val rewrite_find_first_existing: string -> string option
   (** [rewrite_find_first_existing path] uses a BUILD_PATH_PREFIX_MAP mapping (https://reproducible-builds.org/specs/build-path-prefix-map/) and tries to find a source in the mapping that maps to a result that exists in the file system. There are the following return values:
       - None, means BUILD_PATH_PREFIX_MAP is not set.
       - Some None, means no source prefixes of [path] were found in the mapping.
       - Some (Some target), means target is the first file (in priority order) that [path] mapped to that exists in the file system.
       - Not_found raised, means some source prefixes in the map were found that matched [path], but none of the results existed in the file system. *)

   val rewrite_find_all_existing_dirs: string -> string list
   (** [rewrite_find_all_existing_dirs absdir] accumulates a list of existing directories, [dirs], that are the result of mapping abstract directory, [absdir], over all the mapping pairs in the BUILD_PATH_PREFIX_MAP environment variable, if any. The list [dirs] will be in priority order (head as highest priority). The possible results are:
       - [], means BUILD_PATH_PREFIX_MAP is not set, or if set, then there were no matching prefixes of [absdir].
       - dirs, means dirs are the directories found. A possibility is that there was no mapping, but the input path is an existing directory and the result is [[absdir]].
       - Not_found raised, means some source prefixes in the map were found that matched [absdir], but none of the mapping results were existing directories (possibly due to misconfiguration).
       See the BUILD_PATH_PREFIX_MAP spec at (https://reproducible-builds.org/specs/build-path-prefix-map/) *)

In addition to those new APIs, Location.absolute_path was modified to do rewriting on absolute paths (in addition to relative paths, which it was already doing).

At the present time there is no code depending on these new APIs, so the only functional change is the change to Location.absolute_path. The new API functions are used in the other parts of the PRs mentioned.
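The relationship between rewrite_first and rewrite_all described above can be sketched as follows. This is an illustration under stated assumptions, not the code from the PR: the `map` type here is a simplified list of (source, target) pairs with the highest-priority entry first, and `first_existing` is a hypothetical helper showing how an existence check could be layered on top of rewrite_all.

```ocaml
(* Simplified stand-in for the map type (an assumption of this sketch):
   (source, target) pairs, highest priority first. *)
type map = (string * string) list

(* Replace [source] with [target] when [source] is a prefix of [path]. *)
let rewrite_with (source, target) path =
  let ls = String.length source and lp = String.length path in
  if ls <= lp && String.equal source (String.sub path 0 ls)
  then Some (target ^ String.sub path ls (lp - ls))
  else None

(* Every rewrite of [path], in priority order; [] when nothing matches. *)
let rewrite_all (map : map) path =
  List.filter_map (fun pair -> rewrite_with pair path) map

(* The highest-priority rewrite of [path], if any. *)
let rewrite_first map path =
  match rewrite_all map path with
  | [] -> None
  | first :: _ -> Some first

(* Hypothetical helper: the first rewrite accepted by [exists], which
   defaults to a file-system check. *)
let first_existing ?(exists = Sys.file_exists) map path =
  List.find_opt exists (rewrite_all map path)
```

Returning every candidate from rewrite_all is what lets a deployment-phase consumer skip entries whose target is absent on the current machine, instead of committing to the first prefix match.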
The BUILD_PATH_PREFIX_MAP specification tells how to use that environment variable to achieve reproducible builds, i.e. builds of products that do not leak absolute paths from the build environment. See https://reproducible-builds.org/specs/build-path-prefix-map.

However, that specification only describes half of the story. Let us call the building of reproducible products the "Build Phase". That is the phase covered by the existing specification. Let us define the "Deployment Phase" as the phase where you accept a built reproducible product and make use of it, i.e. deploy it. An example would be the debugger taking a reproducible binary and letting the user debug it, showing the user the source code, etc.

We use the same mechanism in the deployment phase. BUILD_PATH_PREFIX_MAP must then be set up with the logical inverse of the mapping used during the build phase. This PR generalizes the functions in Build_path_prefix_map and Location to facilitate the inverse map.

Also, this is part of a larger PR, ocaml#12126, which itself was part of ocaml#12085, which is being split up because it was too large. In addition to these changes, that PR includes ocamltest enhancements plus the tests for these changes. See it for prior discussion and the other changes.

For the new functions, see the '.mli' file for details.

1. utils/build_path_prefix_map.{ml,mli}
   The new API functions are:
   val rewrite_first : map -> path -> path option
   val rewrite_all : map -> path -> path list

2. parsing/location.{ml,mli}
   The new API functions are:
   val rewrite_find_first_existing: string -> string option
   val rewrite_find_all_existing_dirs: string -> string list

In addition to those new APIs, Location.absolute_path was modified to do rewriting on absolute paths (in addition to relative paths, which it was already doing).

At the present time there is no code depending on these new APIs, so the only functional change is the change to Location.absolute_path. The new API functions are used in the other parts of the PRs mentioned.
This is part 2 of a larger PR, ocaml#12085, which includes a part 1 (compiler and debugger changes), this part, and a third part consisting of tests of part 1 that make use of (and hence motivate and test) these changes. Please see that PR for example uses of these changes.

1. Add the ability to have scripts or files with ocamltest variable references expanded. Added a new "expand" action to ocamltest. It is mostly like the "copy" action, but the source file is read a line at a time and the lines are expanded. It does not support source directories, but the destination can be a directory.

2. Enhanced ocamltest with a facility for making builtin functions. These are like ocamltest variables but have a function attached to them. When the variable is expanded, the arguments are first expanded, then the function is called and its result is returned. Currently they only take one parameter, but it should not be too hard to add support for multiple arguments.

3. Added a new "dumpenv_expanded" action which shows not only the value of each variable, but also what it expands to. When variables are originally assigned, their RHS is not expanded; expansion happens only later, when the variable itself is looked up.

4. Two builtin functions were defined:
   - bppm_decode does BUILD_PATH_PREFIX_MAP decoding
   - bppm_encode does BUILD_PATH_PREFIX_MAP encoding
   See https://reproducible-builds.org/specs/build-path-prefix-map/

5. Prepare to be able to do Dune tests:
   5a. Added an action helper for testing if a program is available in PATH.
   5b. Use the helper to make an action, has_dune, that detects whether dune is available.
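For readers unfamiliar with the encoding that bppm_encode and bppm_decode refer to, the following is a small sketch of the per-path escaping defined by the BUILD_PATH_PREFIX_MAP specification: `%` becomes `%#`, `=` becomes `%+`, and `:` becomes `%.`. The function names `encode_path` and `decode_path` are hypothetical and are not claimed to match how the ocamltest builtins are implemented.

```ocaml
(* Escape one path so it can be placed in a BUILD_PATH_PREFIX_MAP value. *)
let encode_path s =
  let b = Buffer.create (String.length s) in
  String.iter
    (function
      | '%' -> Buffer.add_string b "%#"
      | '=' -> Buffer.add_string b "%+"
      | ':' -> Buffer.add_string b "%."
      | c -> Buffer.add_char b c)
    s;
  Buffer.contents b

(* Reverse the escaping; a '%' not followed by '#', '+' or '.' is an error. *)
let decode_path s =
  let n = String.length s in
  let b = Buffer.create n in
  let rec go i =
    if i >= n then Ok (Buffer.contents b)
    else
      match s.[i] with
      | '%' when i + 1 < n ->
        (match s.[i + 1] with
         | '#' -> Buffer.add_char b '%'; go (i + 2)
         | '+' -> Buffer.add_char b '='; go (i + 2)
         | '.' -> Buffer.add_char b ':'; go (i + 2)
         | c -> Error (Printf.sprintf "invalid escape %%%c" c))
      | '%' -> Error "dangling % at end of input"
      | c -> Buffer.add_char b c; go (i + 1)
  in
  go 0
```

Escaping is needed because `:` separates the entries of the variable and `=` separates the two halves of each entry, so a Windows path such as `C:\build` could not otherwise appear in the value.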
Updated new tests to take into account:

1. The new ocamltest syntax that is C-like rather than org-mode like.
2. The debugger now uses DEPLOY_PATH_PREFIX_MAP for its inverse mapping, rather than BUILD_PATH_PREFIX_MAP.

Fixes ocaml#12083
In order to produce reproducible builds (independent of the location
of the build), starting from Dune 3.0, Dune has mapped references to
the workspace directory to “/workspace_root”.
This change enables ocamldebug to recover the locations of the source
code. The ocamldebug user must set the BUILD_PATH_PREFIX_MAP
environment variable to effect the mapping.
If we only want to map "/workspace_root" to a single directory, say
mydir, we would do:
export BUILD_PATH_PREFIX_MAP="mydir=/workspace_root"
If we want to map to multiple directories, we must separate
them, and the trailing part of the path, with ":", but
in a BUILD_PATH_PREFIX_MAP value these are encoded as "%.".
So to map to mydir1 and mydir2, the setting would be:
export BUILD_PATH_PREFIX_MAP="mydir1%.mydir2%.=/workspace_root"
Then if ocamldebug has a path like
"/workspace_root/stuff",
the initial mapping takes this to
"mydir1:mydir2:/stuff"
Then ocamldebug will convert this to the list
["mydir1/stuff"; "mydir2/stuff"]
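The two steps above (prefix rewrite, then splitting on ":") can be sketched in Python as follows. This is only an illustration of the described behaviour under stated assumptions, not the ocamldebug code: the function names are hypothetical, paths are assumed to be POSIX-style (no drive letters containing ":"), and later map entries are assumed to take priority over earlier ones.

```python
def decode_component(encoded: str) -> str:
    """Undo BUILD_PATH_PREFIX_MAP escaping ('%#' is decoded last)."""
    return encoded.replace("%.", ":").replace("%+", "=").replace("%#", "%")


def candidate_paths(prefix_map: str, path: str):
    """Rewrite `path` with the map, then expand ':'-separated directories."""
    rewritten = path
    entries = [e for e in prefix_map.split(":") if e]
    for entry in reversed(entries):  # assumption: later entries win
        target, source = entry.split("=")
        source = decode_component(source)
        if path.startswith(source):
            rewritten = decode_component(target) + path[len(source):]
            break
    # After rewriting, everything before the last ':' is a list of
    # directories and the final piece is the trailing part of the path.
    *dirs, suffix = rewritten.split(":")
    return [d + suffix for d in dirs] or [rewritten]


print(candidate_paths("mydir1%.mydir2%.=/workspace_root",
                      "/workspace_root/stuff"))
# prints ['mydir1/stuff', 'mydir2/stuff']
```

With the single-directory setting "mydir=/workspace_root" the same function returns just ['mydir/stuff'], and a path that matches no entry is returned unchanged.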
Fixes #12083