Skip to content

Fix exponential parser time on sequence of [[[[#10439

Merged
sophiajt merged 2 commits intonushell:mainfrom
anka-213:anka-213/issue10438
Sep 20, 2023
Merged

Fix exponential parser time on sequence of [[[[#10439
sophiajt merged 2 commits intonushell:mainfrom
anka-213:anka-213/issue10438

Conversation

@anka-213
Copy link
Copy Markdown
Contributor

@anka-213 anka-213 commented Sep 20, 2023

Description

Before this change, parsing [[[[[[[[[[[[[[[[[[[[[[ would cause nushell to consume several gigabytes of memory, now it should be linear in time.

The old code first tried parsing the head of the table as a list and then after that it checked if it got more arguments. If it didn't, it throws away the previous result and tries to parse the whole thing as a list, which means we call parse_list_expression twice for each call to parse_table_expression, resulting in the exponential growth

The fix is to simply check that we have all the arguments we need before parsing the head of the table, so we know that we will either call parse_list_expression only on sub-expressions or on the whole thing, never both.

Fixes #10438

User-Facing Changes

Should give a noticable speedup when typing a sequence of [[[[[[ open brackets

Tests + Formatting

I would like to add tests, but I'm not sure how to do that without crashing CI with OOM on regression

  • Don't forget to add tests that cover your changes.
  • cargo fmt --all -- --check to check standard code formatting (cargo fmt --all applies these changes)
  • cargo clippy --workspace -- -D warnings -D clippy::unwrap_used to check that you're using the standard code style
  • cargo test --workspace to check that all tests pass (on Windows make sure to enable developer mode)
  • cargo run -- -c "use std testing; testing run-tests --path crates/nu-std" to run the tests for the standard library

After Submitting

If your PR had any user-facing changes, update the documentation after the PR is merged, if necessary. This will help us keep the docs up to date.

The old code first tried parsing the head of the table as a
list and then after that it checked if it got more arguments.
If it didn't, it throws away the previous result and tries to
parse the whole thing as a list, which means we call
parse_list_expression twice for each call to
parse_table_expression, resulting in the exponential growth

The fix is to simply check that we have all the arguments
we need before parsing the head of the table, so we know
that we will either call parse_list_expression only on
sub-expressions or on the whole thing, never both.

Fixes nushell#10438
@anka-213
Copy link
Copy Markdown
Contributor Author

I've added a test now, but it will just freeze on regression instead of giving an error. I would like to add a timeout, but I don't know how to do that. Maybe it's only available as a global setting for the full testsuite?

@sophiajt
Copy link
Copy Markdown
Contributor

Wow, nice catch

@sophiajt sophiajt merged commit 8d8b443 into nushell:main Sep 20, 2023
hardfau1t pushed a commit to hardfau1t/nushell that referenced this pull request Dec 14, 2023
<!--
if this PR closes one or more issues, you can automatically link the PR
with
them by using one of the [*linking
keywords*](https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword),
e.g.
- this PR should close #xxxx
- fixes #xxxx

you can also mention related issues, PRs or discussions!
-->

# Description
<!--
Thank you for improving Nushell. Please, check our [contributing
guide](../CONTRIBUTING.md) and talk to the core team before making major
changes.

Description of your pull request goes here. **Provide examples and/or
screenshots** if your changes affect the user experience.
-->

Before this change, parsing `[[[[[[[[[[[[[[[[[[[[[[` would cause nushell
to consume several gigabytes of memory, now it should be linear in time.

The old code first tried parsing the head of the table as a list and
then after that it checked if it got more arguments. If it didn't, it
throws away the previous result and tries to parse the whole thing as a
list, which means we call `parse_list_expression` twice for each call to
`parse_table_expression`, resulting in the exponential growth

The fix is to simply check that we have all the arguments we need before
parsing the head of the table, so we know that we will either call
parse_list_expression only on sub-expressions or on the whole thing,
never both.

Fixes nushell#10438


# User-Facing Changes
Should give a noticable speedup when typing a sequence of `[[[[[[` open
brackets
<!-- List of all changes that impact the user experience here. This
helps us keep track of breaking changes. -->

# Tests + Formatting

I would like to add tests, but I'm not sure how to do that without
crashing CI with OOM on regression

- [x] Don't forget to add tests that cover your changes.
- [x] `cargo fmt --all -- --check` to check standard code formatting
(`cargo fmt --all` applies these changes)
- [x] `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used`
to check that you're using the standard code style
- [x] `cargo test --workspace` to check that all tests pass (on Windows
make sure to [enable developer
mode](https://learn.microsoft.com/en-us/windows/apps/get-started/developer-mode-features-and-debugging))
- [x] `cargo run -- -c "use std testing; testing run-tests --path
crates/nu-std"` to run the tests for the standard library
<!--
> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it
automatically
> toolkit check pr
> ```
-->

# After Submitting
If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Parser time and memory use grows exponentially on sequence of [[[[

2 participants