Skip to content

Fix various bugs found via running the test suite#11611

Merged
dhruvmanila merged 2 commits intodhruv/parser-phase-2from
dhruv/bugfixes
May 30, 2024
Merged

Fix various bugs found via running the test suite#11611
dhruvmanila merged 2 commits intodhruv/parser-phase-2from
dhruv/bugfixes

Conversation

@dhruvmanila
Copy link
Member

@dhruvmanila dhruvmanila commented May 30, 2024

Summary

This PR fixes various bugs found when running the test suite. Note that it doesn't update the code to make it compile which is done in a separate PR.

Refer to review comments for each individual bug.

@dhruvmanila dhruvmanila added the bug Something isn't working label May 30, 2024
@dhruvmanila dhruvmanila force-pushed the dhruv/bugfixes branch 2 times, most recently from a40012c to c137200 Compare May 30, 2024 10:04
Base automatically changed from dhruv/tokens-range-query to dhruv/parser-phase-2 May 30, 2024 10:05
}
if checker.enabled(Rule::PrintfStringFormatting) {
pyupgrade::rules::printf_string_formatting(checker, format_string, right);
pyupgrade::rules::printf_string_formatting(checker, bin_op, format_string);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function requires the range and the right side of the binary expression so we let's just pass the binary expression as a whole.

Comment on lines 110 to 116
let (a_range, b_range) = match (a_token.kind(), b_token.kind()) {
(TokenKind::String, TokenKind::String) => (a_token.range(), b_token.range()),
(TokenKind::String, TokenKind::FStringStart) => {
match indexer.fstring_ranges().innermost(a_token.start()) {
match indexer.fstring_ranges().innermost(b_token.start()) {
Some(b_range) => (a_token.range(), b_range),
None => continue,
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Must've been a copy paste mistake.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That must have been annoying to find.

Comment on lines -476 to +478
TokenKind::Async if matches!(self.tokens.peek(), Some((TokenKind::Def, _))) => {
LogicalLineKind::Function
continue;
}
TokenKind::Import => LogicalLineKind::Import,
TokenKind::From => LogicalLineKind::FromImport,
_ => LogicalLineKind::Other,
};

first_logical_line_token = Some((logical_line_kind, range));
is_docstring = kind == TokenKind::String;

let logical_line_kind = match kind {
TokenKind::Class => LogicalLineKind::Class,
TokenKind::Comment => LogicalLineKind::Comment,
TokenKind::At => LogicalLineKind::Decorator,
TokenKind::Def => LogicalLineKind::Function,
// Lookahead to distinguish `async def` from `async with`.
TokenKind::Async
if self
.tokens
.peek()
.is_some_and(|token| token.kind() == TokenKind::Def) =>
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this isn't a bug fix, but mostly a change from:

if matches!(self.tokens.peek(), Some((TokenKind::Def, _)))

To

                            if self
                                .tokens
                                .peek()
                                .is_some_and(|token| token.kind() == TokenKind::Def)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think this is more a compilation error fix?

Comment on lines 111 to 114
diagnostic.set_fix(Fix::safe_edit(Edit::range_replacement(
content,
located_op.range + expr.start(),
located_op.range,
)));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason this offset exists was because the lexer function used was lex instead of lex_starts_at.

let mut tok_iter = lexer::lex(&parenthesized_contents, Mode::Expression)
.flatten()
.skip(1)
.map(|(tok, range)| (tok, range - TextSize::from(1)))
.filter(|(tok, _)| !matches!(tok, Tok::NonLogicalNewline | Tok::Comment(_)))
.peekable();

// Reconstruct the source code by skipping any names that are in `members`.
let mut output = String::with_capacity(import_from_stmt.range().len().to_usize());
let mut last_pos = TextSize::default();
let mut last_pos = import_from_stmt.start();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now, we use the locator containing the entire source code, so the start offset should be the statement start offset.

Comment on lines -183 to +186
(_, quote @ ('\'' | '"')) => self.try_single_char_prefix(first).then_some(quote),
(_, quote @ ('\'' | '"')) => self.try_single_char_prefix(first).then(|| {
self.cursor.bump();
quote
}),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops! Forgot to bump the cursor.

Comment on lines 578 to 580
if self.cursor.eat_char2(quote, quote) {
self.current_flags = TokenFlags::TRIPLE_QUOTED_STRING;
self.current_flags |= TokenFlags::TRIPLE_QUOTED_STRING;
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ummmm

Comment on lines 739 to 741
if self.cursor.eat_char2(quote, quote) {
self.current_flags = TokenFlags::TRIPLE_QUOTED_STRING;
self.current_flags |= TokenFlags::TRIPLE_QUOTED_STRING;
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, not sure why I did this xD

Comment on lines 1464 to 1467
/// Returns `true` if the token is a raw string.
const fn is_raw_string(self) -> bool {
self.contains(TokenFlags::RAW_STRING)
self.intersects(TokenFlags::RAW_STRING)
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using contains is incorrect here.

For all other changes, I find intersects easier to reason about.

@dhruvmanila dhruvmanila marked this pull request as ready for review May 30, 2024 10:22
@dhruvmanila dhruvmanila requested a review from MichaReiser as a code owner May 30, 2024 10:22
Comment on lines 110 to 116
let (a_range, b_range) = match (a_token.kind(), b_token.kind()) {
(TokenKind::String, TokenKind::String) => (a_token.range(), b_token.range()),
(TokenKind::String, TokenKind::FStringStart) => {
match indexer.fstring_ranges().innermost(a_token.start()) {
match indexer.fstring_ranges().innermost(b_token.start()) {
Some(b_range) => (a_token.range(), b_range),
None => continue,
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That must have been annoying to find.

Comment on lines -476 to +478
TokenKind::Async if matches!(self.tokens.peek(), Some((TokenKind::Def, _))) => {
LogicalLineKind::Function
continue;
}
TokenKind::Import => LogicalLineKind::Import,
TokenKind::From => LogicalLineKind::FromImport,
_ => LogicalLineKind::Other,
};

first_logical_line_token = Some((logical_line_kind, range));
is_docstring = kind == TokenKind::String;

let logical_line_kind = match kind {
TokenKind::Class => LogicalLineKind::Class,
TokenKind::Comment => LogicalLineKind::Comment,
TokenKind::At => LogicalLineKind::Decorator,
TokenKind::Def => LogicalLineKind::Function,
// Lookahead to distinguish `async def` from `async with`.
TokenKind::Async
if self
.tokens
.peek()
.is_some_and(|token| token.kind() == TokenKind::Def) =>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think this is more a compilation error fix?


impl LogicalLinesIter<'_> {
fn new(tokens: Iter<'_, parser::Token>, verbatim_range: TextRange) -> Self {
impl<'a> LogicalLinesIter<'a> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I think '_ should just work fine here but there's no need to change it.

Comment on lines 48 to 52
pub(crate) fn visit_token(&mut self, token: &Token) {
if matches!(token.kind(), TokenKind::String | TokenKind::FStringStart) {
if matches!(token.kind(), TokenKind::String | TokenKind::FStringMiddle) {
if token.is_triple_quoted_string() {
self.ranges.push(token.range());
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a bug in main too or something you introduced by your refactor?

@dhruvmanila dhruvmanila merged commit ba4d85c into dhruv/parser-phase-2 May 30, 2024
@dhruvmanila dhruvmanila deleted the dhruv/bugfixes branch May 30, 2024 10:52
dhruvmanila added a commit that referenced this pull request May 31, 2024
## Summary

This PR fixes various bugs found when running the test suite. Note that
it doesn't update the code to make it compile which is done in a
separate PR.

Refer to review comments for each individual bug.
dhruvmanila added a commit that referenced this pull request Jun 3, 2024
## Summary

This PR fixes various bugs found when running the test suite. Note that
it doesn't update the code to make it compile which is done in a
separate PR.

Refer to review comments for each individual bug.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants