Always emit non-logical newlines for 'empty' lines#27
charliermarsh merged 1 commit into RustPython:main
\cc @MichaReiser
    }
    Some('\n' | '\r') => {
        // Empty line!
        let tok_start = self.get_pos();
Unrelated to your changes. I think this emits two newlines when using \r\n instead of one.
"I think this emits two newlines when using \r\n instead of one"
Yeah, that should be the case. This could probably be special-cased by looking at the next char and just advancing, but I can't remember if that caused issues when I was tweaking this a while ago.
I think it might actually work correctly? I think next_char already does the advancement:
    // Helper function to go to the next character coming up.
    fn next_char(&mut self) -> Option<char> {
        let mut c = self.window[0];
        self.window.slide();
        match c {
            Some('\r') => {
                if self.window[0] == Some('\n') {
                    self.location += TextSize::from(1);
                    self.window.slide();
                }
                self.location += TextSize::from(1);
                c = Some('\n');
            }
            #[allow(unused_variables)]
            Some(c) => {
                self.location += c.text_len();
            }
            _ => {}
        }
        c
    }
Ah, I had forgotten about that 😄 The test would also probably fail if this was the case.
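The point made in this thread, that a `\r\n` pair is consumed as a single newline before the lexer ever matches on it, can be illustrated with a small standalone sketch. `Cursor` and `count_newlines` are hypothetical names invented for this example, not the crate's types, and the sketch tracks a character index rather than the `TextSize` byte offset used by the real `next_char`.

```rust
/// Hypothetical stand-in for the lexer's character window: yields characters
/// while collapsing "\r\n" (and a lone '\r') into a single '\n'.
struct Cursor {
    chars: Vec<char>,
    pos: usize, // index into `chars`, not a byte offset
}

impl Cursor {
    fn new(source: &str) -> Self {
        Cursor {
            chars: source.chars().collect(),
            pos: 0,
        }
    }

    /// Returns the next character, normalizing line endings to '\n'.
    fn next_char(&mut self) -> Option<char> {
        let c = *self.chars.get(self.pos)?;
        self.pos += 1;
        if c == '\r' {
            // Swallow the '\n' of a "\r\n" pair so the caller sees one newline.
            if self.chars.get(self.pos) == Some(&'\n') {
                self.pos += 1;
            }
            return Some('\n');
        }
        Some(c)
    }
}

/// Counts how many newline characters the cursor reports for `source`.
fn count_newlines(source: &str) -> usize {
    let mut cursor = Cursor::new(source);
    let mut count = 0;
    while let Some(c) = cursor.next_char() {
        if c == '\n' {
            count += 1;
        }
    }
    count
}
```

Under these assumptions, `count_newlines("a\r\nb")` reports one newline rather than two, which matches the reading that the empty-line branch never sees `\r` and `\n` as separate characters.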
youknowone left a comment
The change looks reasonable.
Given this repository's lack of tests, you may want to make a ruff port before merging this PR.
Please feel free to merge it when you're ready to go.
Ruff PR changed: astral-sh/ruff#4438
4a738f0 to e1f408e
e1f408e to 66ccbc8
Summary
Right now, if you have a comment like:

    # foo

The lexer emits a comment, but no newline. It turns out that if the lexer encounters an "empty" line, we skip the newline emission, and a comment counts as an "empty" line (see: eat_indentation, where we eat indentation and comments).

This PR modifies the lexer to emit a NonLogicalNewline in such cases. As a result, we'll now always have either a newline or non-logical newline token at the end of a line (excepting continuations). I believe this is more consistent with CPython. For example, given this snippet:

CPython outputs:

Note the NL tokens after the comment, and for the empty line, along with the NL token at the end prior to the dedent.
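The rule described in the summary can be sketched with a toy line-based tokenizer. This is a minimal illustration under loud assumptions, not the PR's implementation: `Tok` and `lex_lines` are hypothetical names, any non-comment text on a line collapses into a single `Code` token, and indentation, dedents, continuations, and `#` characters inside string literals are all ignored.

```rust
/// Hypothetical token kinds for this sketch only.
#[derive(Debug, PartialEq)]
enum Tok {
    Code,              // any non-comment source text on a line
    Comment,           // a '#' comment
    Newline,           // ends a logical line
    NonLogicalNewline, // ends an empty or comment-only line
}

/// Tokenizes `source` line by line, always emitting a newline token of some
/// kind for every line that is terminated by '\n'.
fn lex_lines(source: &str) -> Vec<Tok> {
    let mut toks = Vec::new();
    for line in source.split_inclusive('\n') {
        let has_newline = line.ends_with('\n');
        let body = line.trim();
        // Split the line into its code part and an optional trailing comment.
        let (code, has_comment) = match body.find('#') {
            Some(idx) => (body[..idx].trim(), true),
            None => (body, false),
        };
        let is_empty = code.is_empty();
        if !is_empty {
            toks.push(Tok::Code);
        }
        if has_comment {
            toks.push(Tok::Comment);
        }
        if has_newline {
            // Empty and comment-only lines get a non-logical newline
            // instead of being skipped.
            toks.push(if is_empty {
                Tok::NonLogicalNewline
            } else {
                Tok::Newline
            });
        }
    }
    toks
}
```

With this sketch, `lex_lines("# foo\n")` yields a `Comment` followed by a `NonLogicalNewline`, so every terminated line ends in one of the two newline tokens.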