Reduce use of unchecked indexing in parser code#10381
Reduce use of unchecked indexing in parser code#10381anka-213 wants to merge 23 commits intonushell:mainfrom
Conversation
By replacing spans[idx] with spans.get(idx) calls, we can guarantee that we won't get any Index Out of Bounds panics. Some of the ones I replaced are very obviously checked, but that means that I can replace a check by a pattern match. And the goal is to remove all unchecked indexing in the parser.
|
Thanks for diving into this! Changes like this sound good. I think we need to get some of the resident parser experts at the table, as the long term goal should be that the parser state is correct on top of the invalid acccesses getting handled gracefully (which remains a must for stability). (I smiled at the initial PR description :) ) |
|
I don't even know exactly what I changed, but when I refactored the function that crashed before to use my new safer primitives, it resolved the issue in #9072. I did however notice that the new error message refers to the wrong index, so that's probably related to the cause of the original issue. As you can see now the first test says "missing f", which is true. The second test says "missing e", which is clearly a lie, since "e" is the fourth argument. Well, at least it doesn't crash anymore, even though I haven't found the source of the bug. |
By not using nested references. It does however introduce new complexity at one place, but I'll try to remove that later
This reverts commit 14274a1.
Needed in order to call functions with sub-spans, but keeping shared idx
|
Oh, I see. I hadn't reran all the other tests. Apparently, when I fixed the new test caused 95 other tests to fail. Edit: Found the off-by-one error. I still have 7 test failures remaining and I don't actually know exactly which part of the change fixed the bug. I just made the bug impossible, I didn't find the cause of it. |
If you start a loop by increasing an index, you're going to skip the first element
|
Oh, I figured out part of the original bug, this line nushell/crates/nu-parser/src/parser.rs Line 968 in 3a04bd9 only checks for == end, but in this case we were for some reason looking at two steps from the end. After reading ["true", "true", "1"] at index 2 to get the first float, we try to read ["true", "true"] at index 3 to get the second float, which doesn't make sense.
My change ensures that it actually fails that test and skips to the next step, after which we don't get quite a correct error message, but at least a more sensible one. My guess is that there is a bug in |
|
For that area of the code - the use that we're using should be correct, and if there are panics, we want to see them so we can fix them. You can't get a span idx beyond the end of the array, so for the cases where you can, that shows an error in the code rather than a problem with not being defensive in how we handle the input. |
|
The actual bug was here nushell/crates/nu-parser/src/parser.rs Line 590 in 3a04bd9 It should be checking there are fewer remaining positional args than remaining args, rather than fewer positional args than total args. @jntrnr I see your point, but it would still be preferable for the parser to not crash from a user perspective, so maybe printing a warning in unexpected cases or something like that instead? I'd prefer the confusing message "missing e", despite it being "f" that's missing, over just a full crash. The idea with my changes is not primarily to be more defensive in the code, but more to encode more invariants in the types and so reducing the need for defensive programming down the line. In general, if we can move as much of the finicky index-manipulation out of the main parser code and into generic primitives, we can reduce the risk for tiny bugs like these popping up. Edit: I've added an assert with the old check in the error case now. This is also an improvement because the panic is closer to the source of the error rather than later down the line where invariants were unsatisfied. |
Old code was comparing remaining positional arguments with total number of arguments, where it should've compared remaining positional with with remaining arguments of any kind Fixes nushell#9072
|
Now all the tests passes again. I had a mistake in my translation, where I had accidentally changed |
crates/nu-parser/src/parser.rs
Outdated
| spans.len() | ||
| } else if positional_idx < signature.required_positional.len() | ||
| && spans.len() > (signature.required_positional.len() - positional_idx) | ||
| && spans.len() - spans_idx > (signature.required_positional.len() - positional_idx) |
| Expression { | ||
| expr: Expr::VarDecl(id), | ||
| span: span(&spans[*spans_idx..*spans_idx + 1]), | ||
| span: spans.current(), |
There was a problem hiding this comment.
why were we recreating a single span in a complicated way here?
span(&spans[*spans_idx..*spans_idx + 1]),
is equivalent to
spans[*spans_idx]
as far as I can tell
|
This is ready for review now. I have some plans to keep refactoring more of the parser in the same style, but I might as well do that in a new PR, since this is self-contained as is. |
|
Oops, I forgot that subtraction is not a safe operator with unsigned integers. It should work now |
|
@anka-213 - if you found where the bugfix should be, that'd be a good one to create a PR for. We're still pre-1.0. Users who are using Nushell now are helping us test and get it ready for 1.0 (that's why we put the disclaimer in the README). I get that it's annoying as a user to hit issues, but we want folks to file issues. If it's just a warning, I think some people may not even see it as it might be part of a script they're running. Not saying I want the parser to crash. I just think we need to be hammering it into shape. Finding issues like the one you found, fixing them, and repeating until we can't find any anymore. |
|
@jntrnr I've extracted the bug fix into #10395 now. Yeah, I see your point and I have changed the code to as much as possible be a refactor rather than permitting more programs than before (which I accidentally did before). The main difference should be in moving crashes closer to the source, by enforcing invariants on construction, rather than something unrelated crashing later down the line because of the broken invariants. I would view it as much more of an extension of the "don't use unwrap" guideline than a case of defensive programming. Using I was already from the start trying to avoid changing any behaviour (just writing the exact same checks as before using new primitives), but now I see that I shouldn't view crashes as a thing to avoid as long as they only happen if there is a bug. Which also makes the refactor simpler, since now any cases that were missing from the old checks can be replaced by a call to panic, instead of trying to figure out what could cause it to happen and how to handle it. |
|
Here's a rather silly crash I found by looking at what assumptions indexing operations have and working backwards from that: as soon as you type the "s" in the final "alias" it crashes with Now, this is probably not something a user should ever do, so this is more of an illustration of broken assumptions than a serious bug. My goal is to make the code less fragile so it's more difficult have incorrect assumptions without noticing. And also hopefully remove a bunch of the defensive programming that is already present in the code. |
# Description Old code was comparing remaining positional arguments with total number of arguments, where it should've compared remaining positional with with remaining arguments of any kind. This means that if a function was given too few arguments, `calculate_end_span` would believe that it actually had too many arguments, since after parsing the first few arguments, the number of remaining arguments needed were fewer than the *total* number of arguments, of which we had used several. Fixes #9072 Fixes: #13930 Fixes: #12069 Fixes: #8385 Extracted from #10381 ## Bonus It also improves the error handling on missing positional arguments before keywords (no longer crashing since #9851). Instead of just giving the keyword to the parser for the missing positional, we give an explicit error about a missing positional argument. I would like better descriptions than "missing var_name" though, but I'm not sure if that's available without Old error ``` Error: nu::parser::parse_mismatch × Parse mismatch during operation. ╭─[entry #1:1:1] 1 │ let = if foo · ┬ · ╰── expected valid variable name ╰──── ``` New error ``` Error: nu::parser::missing_positional × Missing required positional argument. ╭─[entry #18:1:1] 1 │ let = foo · ┬ · ╰── missing var_name ╰──── help: Usage: let <var_name> = <initial_value> ``` # User-Facing Changes The program `alias = = =` is no longer accepted by the parser
This draft is currently just an example of the kind of changes that would be done to the rest of the parser.
Description
By replacing
spans[idx]withspans.get(idx)calls, we can guarantee that we won't get any Index Out of Bounds panics, since we get backOption<Span>where we'll need to actually check that it was valid.Some of the ones I replaced are very obviously checked, but that means that I can replace a check by a pattern match. And the goal is to remove all unchecked indexing in the parser.
Further ideas:
User-Facing Changes
None. Just refactoring and preventing crashes.
Tests + Formatting
cargo fmt --all -- --checkto check standard code formatting (cargo fmt --allapplies these changes)cargo clippy --workspace -- -D warnings -D clippy::unwrap_usedto check that you're using the standard code stylecargo test --workspaceto check that all tests pass (on Windows make sure to enable developer mode)cargo run -- -c "use std testing; testing run-tests --path crates/nu-std"to run the tests for the standard libraryAfter Submitting
No user-facing changes
More info
The approach is inspired by the parse don't validate idea, where instead of first having a validation step where we check that the data is valid and then later construct data with implicit assumptions about the data, we combine these two into a single step, where we try to construct the data and if the data was invalid, we get a
Noneback.