Skip to content

Change how and and or operations are compiled to IR to support custom values#14653

Merged
fdncred merged 1 commit intonushell:mainfrom
devyn:fix-and-or-ops-to-support-non-boolean-values
Dec 25, 2024
Merged

Change how and and or operations are compiled to IR to support custom values#14653
fdncred merged 1 commit intonushell:mainfrom
devyn:fix-and-or-ops-to-support-non-boolean-values

Conversation

@devyn
Copy link
Copy Markdown
Contributor

@devyn devyn commented Dec 21, 2024

Description

Because and and or are short-circuiting operations in Nushell, they must be compiled to a sequence that avoids evaluating the RHS if the LHS is already sufficient to determine the output - i.e., false for and and true for or. I initially implemented this with branch-if instructions, simply returning the RHS if it needed to be evaluated, and returning the short-circuited boolean value if it did not.

Example for $a and $b:

   0: load-variable          %0, var 999 "$a"
   1: branch-if              %0, 3
   2: jump                   5
   3: load-variable          %0, var 1000 "$b" # label(0), from(1:)
   4: jump                   6
   5: load-literal           %0, bool(false) # label(1), from(2:)
   6: span                   %0          # label(2), from(4:)
   7: return                 %0

Unfortunately, this broke polars, because using and/or on custom values is perfectly valid and they're allowed to define that behavior differently, and the polars plugin uses this for boolean masks. But without using the binary-op instruction, that custom behavior is never invoked. Additionally, branch-if requires a boolean, and custom values are not booleans. This changes the IR to the following, using the match instruction to check for the specific short-circuit value instead, and still invoking binary-op otherwise:

   0: load-variable          %0, var 125 "$a"
   1: match                  (false), %0, 4
   2: load-variable          %1, var 124 "$b"
   3: binary-op              %0, Boolean(And), %1
   4: span                   %0          # label(0), from(1:)
   5: return                 %0

I've also renamed Pattern::Value to Pattern::Expression and added a proper Pattern::Value variant that actually contains a Value instead. I'm still hoping to remove Pattern::Expression eventually, because it's kind of a hack - we don't actually evaluate the expression, we just match it against a few cases specifically for pattern matching, and it's one of the cases where AST leaks into IR and I want to remove all of those cases, because AST should not leak into IR.

Fixes #14518

User-Facing Changes

  • and and or now support custom values again.
  • the IR is actually a little bit cleaner, though it may be a bit slower; match is more complex.

Tests + Formatting

The existing tests pass, but I didn't add anything new. Unfortunately I don't think there's anything built-in to trigger this, but maybe some testcases could be added to polars to test it.

…stom values

Because `and` and `or` are short-circuiting operations in Nushell, they must be compiled to a sequence that avoids evaluating the RHS if the LHS is already sufficient to determine the output - i.e., `false` for `and` and `true` for `or`. I initially implemented this with `branch-if` instructions, simply returning the RHS if it needed to be evaluated, and returning the short-circuited boolean value if it did not.

Example for `$a and $b`:

```
   0: load-variable          %0, var 999 "$a"
   1: branch-if              %0, 3
   2: jump                   5
   3: load-variable          %0, var 1000 "$b" # label(0), from(1:)
   4: jump                   6
   5: load-literal           %0, bool(false) # label(1), from(2:)
   6: span                   %0          # label(2), from(4:)
   7: return                 %0
```

Unfortunately, this broke polars, because using `and`/`or` on custom values is perfectly valid and they're allowed to define that behavior differently, and the polars plugin uses this for boolean masks. But without using the `binary-op` instruction, that custom behavior is never invoked. Additionally, `branch-if` requires a boolean, and custom values are not booleans. This changes the IR to the following, using the `match` instruction to check for the specific short-circuit value instead, and still invoking `binary-op` otherwise:

```
   0: load-variable          %0, var 125 "$a"
   1: match                  (false), %0, 4
   2: load-variable          %1, var 124 "$b"
   3: binary-op              %0, Boolean(And), %1
   4: span                   %0          # label(0), from(1:)
   5: return                 %0
```

I've also renamed `Pattern::Value` to `Pattern::Expression` and added a proper `Pattern::Value` variant that actually contains a `Value` instead. I'm still hoping to remove `Pattern::Expression` eventually, because it's kind of a hack - we don't actually evaluate the expression, we just match it against a few cases specifically for pattern matching, and it's one of the cases where AST leaks into IR and I want to remove all of those cases, because AST should not leak into IR.

Fixes nushell#14518
@devyn
Copy link
Copy Markdown
Contributor Author

devyn commented Dec 21, 2024

Working with polars:

> ([true, false] | polars into-df) and ([true, true] | polars into-df)
╭───┬─────────╮
 # │ and_0_0 │
├───┼─────────┤
 0  true    
 1  false   
╰───┴─────────╯
> view ir { ([true, false] | polars into-df) and ([true, true] | polars into-df) }
# 3 registers, 18 instructions, 0 bytes of data
   0: load-literal           %0, list(capacity = 2)
   1: load-literal           %1, bool(true)
   2: list-push              %0, %1
   3: load-literal           %1, bool(false)
   4: list-push              %0, %1
   5: redirect-out           value
   6: call                   decl 491 "polars into-df", %0
   7: match                  (false), %0, 16
   8: load-literal           %1, list(capacity = 2)
   9: load-literal           %2, bool(true)
  10: list-push              %1, %2
  11: load-literal           %2, bool(true)
  12: list-push              %1, %2
  13: redirect-out           value
  14: call                   decl 491 "polars into-df", %1
  15: binary-op              %0, Boolean(And), %1
  16: span                   %0          # label(0), from(7:)
  17: return                 %0

@fdncred fdncred merged commit 35d2750 into nushell:main Dec 25, 2024
@fdncred
Copy link
Copy Markdown
Contributor

fdncred commented Dec 25, 2024

Thanks

@github-actions github-actions bot added this to the v0.102.0 milestone Dec 25, 2024
WindSoilder added a commit that referenced this pull request Apr 25, 2025
…tor (#15623)

# Description
Fixes: #15510
I think it's introduced by #14653, which changes `and/or` to `match`
expression.

After looking into `compile_match`, it's important to collect the value
before matching this.
```rust
    // Important to collect it first
    builder.push(Instruction::Collect { src_dst: match_reg }.into_spanned(match_expr.span))?;
```
This pr is going to apply the logic while compiling `and/or` operation.

# User-Facing Changes
The following will raise a reasonable error:
```nushell
> (nu --testbin cococo false) and true
Error: nu::shell::operator_unsupported_type

  × The 'and' operator does not work on values of type 'string'.
   ╭─[entry #7:1:2]
 1 │ (nu --testbin cococo false) and true
   ·  ─┬                         ─┬─
   ·   │                          ╰── does not support 'string'
   ·   ╰── string
   ╰────
```

# Tests + Formatting
Added 1 test.

# After Submitting
Maybe need to update doc
nushell/nushell.github.io#1876

---------

Co-authored-by: Stefan Holderbach <sholderbach@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

dataframe combining logical masks

2 participants