shellIntegrationAddon: fix broken deserializeMessage() implementation + add tests #165635
Merged
Tyriar merged 3 commits into microsoft:main on Nov 28, 2022
This was referenced Nov 6, 2022
Merged
shellIntegration-rc.zsh: escape values in "E" (executed command) and "P" (property KV) codes
#165633
Merged
846a1ea to f33e9dd
f33e9dd to 9d45c75
The force-pushes here (and linked PRs) are just no-op rebases to clear the "branch out of date" gate.
…e implementation
The test suite includes several skipped test cases for bugs which are fixed in subsequent commits.
…t of string'
deserializeMessage() would not recognize escape sequences at the start of a string, due to a simple falsey check on `match?.index` which could be validly 0 instead of null. In other words: "Foo\x3bBar" would decode to "Foo;Bar", but "\x3b" would not be recognized as an escape sequence.
This fixes that bug, enabling those suppressed test cases, and disables the newly-failing tests which had only passed due to the (still-broken) escaping logic being skipped in those situations.
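The falsy-index pitfall described in this commit message can be reproduced in a few lines. This is a minimal sketch, not the code from the PR: the function names `findEscapeBuggy` and `findEscapeFixed` and the exact regex are assumptions made for illustration.

```typescript
// Hypothetical helpers for illustration; not the addon's actual code.
const ESCAPE = /\\x([0-9a-fA-F]{2})/;

// Returns the offset of the first "\xAB" escape, or undefined if none is found.
function findEscapeBuggy(s: string): number | undefined {
  const match = ESCAPE.exec(s);
  // Bug: an index of 0 is falsy, so a match at the very start of the
  // string is treated the same as "no match".
  if (!match?.index) {
    return undefined;
  }
  return match.index;
}

function findEscapeFixed(s: string): number | undefined {
  const match = ESCAPE.exec(s);
  // Test for the absence of a match explicitly, so index 0 is accepted.
  if (match === null) {
    return undefined;
  }
  return match.index;
}
```

With these definitions, `findEscapeBuggy("Foo\\x3bBar")` finds the escape at offset 3, while `findEscapeBuggy("\\x3bBar")` reports no match even though `findEscapeFixed("\\x3bBar")` finds it at offset 0.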
The original implementation confused escaped and un-escaped sequences due to squashing each adjacent pair of backslashes prior to parsing. Unlike encoding, decoding cannot be performed in passes this way: it must be sequential, because of the potential for overlapping patterns (e.g. the third backslash in "\\\x3b").

This replaces the implementation with a single sequential regex, which matches the escape sequences and replaces each match. This enables the tests that were broken in the previous implementation.

To illustrate, given the following two different original values (here as literals between «», without any escaping):

a. «Packing\Stuff\x3BEarmuffs»
b. «Packing\Stuff;Earmuffs»

Those should be distinctly encoded as:

a. «Packing\\Stuff\\x3BEarmuffs»
b. «Packing\\Stuff\x3BEarmuffs»

The original implementation wrongly threw away the escaping information by replacing each adjacent pair of backslashes with a single backslash, regardless of whether escaping was in effect or not:

a. «Packing\Stuff\x3BEarmuffs»
b. «Packing\Stuff\x3BEarmuffs»

The new implementation matches, in (a), each "\\", which leaves no "\x…" sequence that does not overlap one of those matches. In (b), the first "\\" is matched, and the "\x3B" is matched. This works for any correct combination of adjacent escape sequences.
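The difference between the two decoding strategies can be sketched as follows. This is an illustration under assumptions, not the merged implementation: the names `decodeSequential` and `decodeTwoPass` are hypothetical, and the sketch assumes the only escapes are `\\` and `\xAB` (two hex digits). Like the PR, it does not attempt to handle multibyte characters.

```typescript
// Hypothetical sketch of single-pass decoding; not the code from the PR.
// One regex matches either escape form, so the string is consumed left to
// right and a backslash can never take part in two different matches.
function decodeSequential(message: string): string {
  return message.replace(
    /\\(\\|x([0-9a-fA-F]{2}))/g,
    (_match: string, escaped: string, hex: string | undefined) =>
      hex !== undefined ? String.fromCharCode(parseInt(hex, 16)) : escaped
  );
}

// The two-pass approach described above, kept only for contrast:
// squashing "\\" first destroys the information about which
// backslashes were themselves escaped.
function decodeTwoPass(message: string): string {
  const squashed = message.replace(/\\\\/g, "\\");
  return squashed.replace(
    /\\x([0-9a-fA-F]{2})/g,
    (_match: string, hex: string) => String.fromCharCode(parseInt(hex, 16))
  );
}

// Encoded forms (a) and (b) from the commit message, as TypeScript literals.
const encodedA = "Packing\\\\Stuff\\\\x3BEarmuffs";
const encodedB = "Packing\\\\Stuff\\x3BEarmuffs";
```

Here `decodeSequential(encodedA)` and `decodeSequential(encodedB)` give back the two distinct original values, whereas `decodeTwoPass` returns the same string for both inputs.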
9d45c75 to 7d99990
Tyriar approved these changes on Nov 23, 2022
Great PR, thanks for noticing my broken code 😉
rzhao271 approved these changes on Nov 23, 2022
The `deserializeMessage()` implementation was broken and did not contain useful tests.

This adds tests (initially failing) and fixes the main correctness issues:

- `deserializeMessage()` would not recognize escape sequences at the start of a string, due to a mistaken falsey check on `match?.index` which could be validly `0` instead of `null`.
  In other words: `"Foo\x3bBar"` would decode to `"Foo;Bar"`, but the string `"\x3bBar"` would not be recognized as containing an escape sequence.
- `deserializeMessage()` confused escaped and un-escaped sequences due to squashing each adjacent pair of backslashes prior to parsing.
  Unlike encoding, decoding cannot be performed in passes: it must be sequential across the string, because of the potential for overlapping patterns (e.g. the third backslash in `\\\x3b`). The original implementation wrongly threw away the escaping information by replacing each adjacent pair of backslashes with a single backslash, regardless of whether escaping was in effect on those backslashes or not.
  This replaces the implementation with a single sequential regex, which matches the escape sequences and replaces each match.
This PR leaves unresolved the incorrect handling of multibyte characters, however, which is the only remaining deficiency I'm aware of.
This relates to #155639. Certainly there must be other bugs filed caused by this, too, but since the shell integrations themselves don't perform escaping (see following PRs), it's hard to tell which issues are caused by the interpretation here vs. the lack of escaping there.
PRs implementing the escaping scheme for various shells: