Skip to content

Fix ocamltest variable handling#13961

Merged
gasche merged 3 commits intoocaml:trunkfrom
damiendoligez:fix-13941-ocamltest-variables
May 14, 2025
Merged

Fix ocamltest variable handling#13961
gasche merged 3 commits intoocaml:trunkfrom
damiendoligez:fix-13941-ocamltest-variables

Conversation

@damiendoligez
Copy link
Copy Markdown
Member

How it works before this PR

Variables are handled with two data structures:

  • a global hash table of know variables
  • a map that stores the environment during script execution.

The map is persistent and handled as expected in a stack-like fashion following the tree structure of the test script. The global hash table, on the other hand, is a mutable data structure where variables are added but never removed.

A variable has two possible states in the global table: registered or unregistered.

In the environment, a variable has three possible states:

  • string: defined to a string value
  • null: defined to a null value
  • absent: not found in the map.

The possible states of a variable are:

  • registered/string r/str
  • registered/null r/nul
  • registered/absent r/abs
  • unregistered/absent u/abs
  • unregistered/null u/nul

There are 4 operations that a test script can do on a variable: set, =, +=, unset. They change a variale's state as follows (where * denotes an error).

set:

  • r/str -> *
  • r/nul -> *
  • r/abs -> *
  • u/nul -> r/str
  • u/abs -> r/str

=, +=:

  • r/str -> r/str
  • r/nul -> r/str
  • r/abs -> r/str
  • u/abs -> *
  • u/nul -> *

unset:

  • r/str -> r/nul
  • r/nul -> r/nul
  • r/abs -> r/nul
  • u/abs -> u/nul
  • u/nul -> u/nul

There are two kinds of variables: built-in variables and user variables. Built-in variables are preloaded in the global hash table, while user variables are inserted on demand by set. If a built-in variable is used while absent from the environment, it will get a default value, computed at the time of use.

The problem is that the state r/abs represents two things:

  • Built-in variables that were not modified by the script. set cannot be used on these.
  • User variables defined in some other branch of the script. set should be usable on these variables, but it isn't.

The fix

If we stop registering user variables, we remove the problem of user variables that were registered in another branch of the script. We get u/str as a new possible state and the following transitions:

set:
r/str -> *
r/nul -> *
r/abs -> *
u/abs -> u/str
u/nul -> *
u/str -> *

=, +=:
r/str -> r/str
r/nul -> r/str
r/abs -> r/str
u/abs -> *
u/nul -> u/str
u/str -> u/str

unset:
r/str -> r/nul
r/nul -> r/nul
r/abs -> r/nul
u/abs -> u/nul
u/nul -> u/nul
u/str -> u/nul

Note: I removed the set: u/nul -> u/set transition that allowed the user to set a user variable that was already unset; it is quite clear from the code that it was a bug. The intent was that a variable should be initially set or unset, then modified with =, +=, or unset.

Alternate design

A more proper fix would be to explicitly insert the built-in variables in the environment (as a 4th state, "default") and make the primitives act purely on the environment. This would be a more invasive fix, with a greater change of breakage.

fixes #13941

@damiendoligez damiendoligez self-assigned this Apr 15, 2025
damiendoligez added a commit to damiendoligez/ocaml that referenced this pull request Apr 15, 2025
@gasche
Copy link
Copy Markdown
Member

gasche commented Apr 15, 2025

This sounds reasonable, with your proposition the u/r status never changes, so the system is easier to think about.

Question: why is there a distinction between abs and nul? (What happens if we try to access a nul variable, do we get an empty string? Can you add a table such as yours for a read/access operation?)

@gasche
Copy link
Copy Markdown
Member

gasche commented Apr 15, 2025

I had understood from your explanation that builtin variables are evaluated on demand:

If a built-in variable is used while absent from the environment, it will get a default value, computed at the time of use.

I wasn't able to find this in the ocamltest source. My understanding is that this is in fact not the case: actions will typically define some of the builtin variables that are relevant, and then the include form can define additional variables via the "modifiers" mechanism.

If my intuition above is correct, then I think that a clearer mental model is as follows:

  • the Variables table contains variables that are reserved, and may or may not be defined in the current environment
  • the variables that are defined are exactly those present in env

The code prevents user from defining via set a variable that is reserved, but it can be assigned via =.

I agree that reserving variables when they are first set is a bug in trunk, because it prevents setting them in parallel branches.

let var = Variables.find_or_make variable_name in
match (Variables.find_variable variable_name,
Environments.is_variable_defined var env,
decl)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is obscure to me (for example: if we just called find_or_make, why are we calling find again?).

Given your explanation in the PR, taking into account the new logic of never registering user variables, I think that the meaning of each component is as follows:

  • Variables.find_variable tells us whether a variable is reserved (but it may or may not be set)
  • Environments.is_variable_defined tests if the variable is actually defined (not abs, in your terminology)

This is actually a case where labeled tuples could be put to good use (untested)!

let reserved = Option.is_some (Variables.find_variable variable_name) in
let defined = Environments.is_variable_defined var env in
match (~reserved, ~defined, ~decl) with
| (~reserved:false, ~defined:false, ~decl:true) ->
  Environments.add var value env
| (~reserved:_, ~defined:_, ~decl:true) ->
  raise (Variables.Variable_already_registered variable_name)
| (~reserved:false, ~defined:false, ~decl:false) ->
  raise (Variables.No_such_variable variable_name)
| (~reserved:_, ~defined:_, ~decl:false) ->
  Environments.add var value env

Or a simpler approach:

let reserved = Option.is_some (Variables.find_variable variable_name) in
let defined = Environments.is_variable_defined var env in
if decl then begin
  if reserved || defined then raise (Variables.Variable_already_registered variable_name)
end else begin
  if not (reserved || defined) then raise (Variables.No_such_variable variable_name)
end;
Environments.add var value env

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've changed this function along the lines of your simpler approach, and I changed the name find_or_make to from_name to make it less confusing.

@damiendoligez
Copy link
Copy Markdown
Member Author

Question: why is there a distinction between abs and nul? (What happens if we try to access a nul variable, do we get an empty string? Can you add a table such as yours for a read/access operation?)

This is important. The environment is not an environment, but an environment transformer, which will be applied to the environment inherited from the OS. If a variable is abs it will not be modified, but if it is nul it will be removed from the environment. You can see this at work in tests/basic/opt_variants.ml.

Here's your table:

OS environment ocamltest environment environment passed to the test program
VAR=v1 or (nothing) set(v2) VAR=v2
VAR=v1 or (nothing) nul (nothing)
VAR=v1 abs VAR=v1
(nothing) abs (nothing)

@damiendoligez
Copy link
Copy Markdown
Member Author

I had understood from your explanation that builtin variables are evaluated on demand:

If a built-in variable is used while absent from the environment, it will get a default value, computed at the time of use.

I wasn't able to find this in the ocamltest source. My understanding is that this is in fact not the case: actions will typically define some of the builtin variables that are relevant, and then the include form can define additional variables via the "modifiers" mechanism.

If you look in ocaml_actions.ml you will find a few occurrences of Environments.add_if_undefined. This is for variables that are computed on the spot, unless the user decided to override by putting them in the environment. (compiler_reference, compiler_output, ocamldoc_reference, ocamldoc_output, program, program2, output, compiler_output, skip_header_lines, and a couple more in action_helpers.ml) So it's only a few variables that are evaluated on demand, but that's the reason why I went for a minimal patch instead of overhauling the whole environment system.

  • the Variables table contains variables that are reserved, and may or may not be defined in the current environment
  • the variables that are defined are exactly those present in env
    The code prevents user from defining via set a variable that is reserved, but it can be assigned via =.

In fact, you cannot set a variable that has already been set (in trunk: globally; in this PR: in the current branch).

@damiendoligez damiendoligez force-pushed the fix-13941-ocamltest-variables branch from cc1e15b to 1d3252f Compare May 14, 2025 12:54
Copy link
Copy Markdown
Member

@gasche gasche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have the impression that the changes make the system simpler than before, and also they fix a bug. Approved.

@gasche gasche merged commit c4c20d1 into ocaml:trunk May 14, 2025
33 checks passed
MisterDA pushed a commit to MisterDA/ocaml that referenced this pull request May 16, 2025
@damiendoligez damiendoligez deleted the fix-13941-ocamltest-variables branch August 28, 2025 10:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Wrong behaviour of environment variables in ocamltest

2 participants