[Yaml] Fix parsing of unquoted multiline scalars with comments or blank lines#62359
Conversation
|
Thank you @yoeunes. |
| next line | ||
|
|
||
| # this comment should be ignored | ||
| final line |
There was a problem hiding this comment.
I am not sure this is a valid example, all online parser that I found terminate parsing the value as soon as they stumble upon a comment line (see https://yaml-online-parser.appspot.com/?yaml=key%3A+foo%0A+bar%0A+%23+m%0A+foobar%0Akey2%3A+baz&type=json for such an example).
There was a problem hiding this comment.
Hi @xabbuh, thank you for your feedback on the spec compliance (3.2.3.3).
I've opened a new PR to address this properly: #62409 which implements what I believe is the correct, spec-compliant behavior.
What do you think of this new approach? Is this the right way to go, or do you think we should prefer to keep a more permissive (doc-aligned) behavior?
Comments are ignored by the YAML parser and do not need to be indented according to the current level of nesting in a collection.
…or comments (yoeunes) This PR was merged into the 6.4 branch. Discussion ---------- [Yaml] Align unquoted multiline scalar parsing with spec for comments | Q | A | ------------- | --- | Branch? | 6.4 | Bug fix? | yes | New feature? | no | Deprecations? | no | Issues | - | License | MIT This PR is a follow-up to [#62359](#62359 (comment)) based on feedback from `@xabbuh` regarding spec compliance. My previous PR fixed a `ParseException` on blank lines, but it handled comments incorrectly (it **ignored** them inside the scalar). This PR aligns the parser with the YAML spec ([3.2.3.3](https://yaml.org/spec/1.2.2/#3233-comments)): 1. It **keeps the fix** for blank lines (they are correctly preserved). 2. It **changes comment handling**: instead of ignoring an indented comment, the parser now **terminates** the multiline scalar when it encounters one. Commits ------- 7911644 [Yaml] Align unquoted multiline scalar parsing with spec for comments
…lars (yoeunes) This PR was merged into the 6.4 branch. Discussion ---------- [Yaml] Fix regression handling blank lines in unquoted scalars | Q | A | ------------- | --- | Branch? | 6.4 | Bug fix? | yes | New feature? | no | Deprecations? | no | Issues | Fix #62581 | License | MIT This PR fixes a regression introduced in #62359 regarding unquoted multiline scalars. A check for blank lines was incorrectly added, causing the parser to bypass indentation checks and consume trailing newlines. This resulted in incorrect line numbers being reported in exceptions (e.g. when a colon is used in an unquoted value). Commits ------- 1480a23 [Yaml] Fix regression handling blank lines in unquoted scalars
This PR fixes two bugs when parsing unquoted multiline scalars:
ParseException.For example, the following YAML would previously fail or parse incorrectly: