Handle non-printable characters in diff view#11687
Handle non-printable characters in diff view#11687MichaReiser merged 2 commits intoastral-sh:mainfrom
Conversation
|
MichaReiser
left a comment
There was a problem hiding this comment.
Thank you for looking into this.
I'm a bit surprised that we don't see a single test failure. I think we may also need to make this change in
ruff/crates/ruff_linter/src/message/text.rs
Line 173 in 8a25531
and
| } | ||
|
|
||
| impl ShowNonprinting for &str { | ||
| fn show_nonprinting(&self) -> String { |
There was a problem hiding this comment.
It would be nice if we could return a Cow here to avoid allocating a string if it doesn't contain any non printable characters.
We may be able to do something similar to
https://github.com/mitsuhiko/insta/blob/569bbade9d9ff4bd43fe4138bdcbde00a6bf34c4/insta/src/output.rs#L325-L339
| fn show_nonprinting(&self) -> String { | ||
| let mut output = String::with_capacity(self.len()); | ||
|
|
||
| for byte in self.bytes() { |
There was a problem hiding this comment.
Do I understand this correctly, that this will escape all non-ascii characters (in addition to some ascii characters)? I think that's undesirable. I would be very confused to see ä escaped in a Diff and wouldn't understand where this is coming from.
I recommend focusing on the non printable characters and would consider using unicode characters for them (similar to what insta does see link above) or biomejs (https://github.com/MichaReiser/biome/blob/96bbf507a006274518e8f5ae5af57c56b7771fb0/crates/biome_diagnostics/src/display/frame.rs#L394-L406)
There was a problem hiding this comment.
Yes, that was correct. Fixed in the recent version.
6c99816 to
359080e
Compare
359080e to
434dc95
Compare
434dc95 to
cd1ae4a
Compare
|
Updated the snapshot tests, they look to be much more clear since the |
|
Thanks for contributing! |
MichaReiser
left a comment
There was a problem hiding this comment.
Nice, this is a huge improvement over what we used to have before.
| match change.tag() { | ||
| ChangeTag::Equal => write!(f, " {}", change.value())?, | ||
| ChangeTag::Delete => write!(f, "{}{}", "-".red(), change.value().red())?, | ||
| ChangeTag::Insert => write!(f, "{}{}", "+".green(), change.value().green())?, | ||
| ChangeTag::Equal => write!(f, " {}", change.value().show_nonprinting())?, | ||
| ChangeTag::Delete => write!( | ||
| f, | ||
| "{}{}", | ||
| "-".red(), | ||
| change.value().show_nonprinting().red() | ||
| )?, | ||
| ChangeTag::Insert => write!( | ||
| f, | ||
| "{}{}", | ||
| "+".green(), | ||
| change.value().show_nonprinting().green() | ||
| )?, |
There was a problem hiding this comment.
Nit, I would introduce a variable with the cleaned-up code to reduce the diff size and reduce the risk that one path misses the call.
let change = change.value().show_nonprinting();
match change.tag {
ChangeTag::Equal => write!(f, " {}", change)?,
...
}
Summary
Fixes #10841 by handling non-printable characters in the same way as GNU coreutils cat's
--show-nonprintingflag. One downside of this approach is that unicode characters are no longer printed as their actual character, which could be problematic for non-Latin languages.Test Plan
Tested with the same example in #10841, but would appreciate a pointer to where I can programmatically add some tests for various non-printable ASCII characters.
Example of Unicode Changes
Additional Considerations
As @carljm pointed out, this would make the code diverge from git diff.
Would appreciate any feedback on this quick, slightly hacky diff!