fix: line endings are set based on file#339
Conversation
Added a function to detect line endings in content.
| if (lf > crlf && lf > cr) return '\n'; | ||
| if (cr > crlf && cr > lf) return '\r'; | ||
|
|
||
| return '\n'; |
There was a problem hiding this comment.
Should this default be user definable? Cc @AndrewSouthpaw
There was a problem hiding this comment.
That would ease some of what I mention above. Something like --line-ending-default or something, which we can fall back to (and not throw) in cases of mixed line endings.
There was a problem hiding this comment.
Let's push configurable option to next minor.
| if (crlf > lf && crlf > cr) return '\r\n'; | ||
| if (lf > crlf && lf > cr) return '\n'; | ||
| if (cr > crlf && cr > lf) return '\r'; |
There was a problem hiding this comment.
I am not sure about how we should handle files with mixed endings. Perhaps we should do.
| if (crlf > lf && crlf > cr) return '\r\n'; | |
| if (lf > crlf && lf > cr) return '\n'; | |
| if (cr > crlf && cr > lf) return '\r'; | |
| if (lf + cr === 0 && crlf > 0) return '\r\n'; | |
| if (crlf + cr === 0 && lf > 0) return '\n'; | |
| if (crlf + lf === 0 && cr > 0) return '\r'; |
Thoughts @AndrewSouthpaw
There was a problem hiding this comment.
Hmmm this could be a major change, as I mentioned here.
IMO it'd make sense for this to be behind a flag for v2, and change the default to automatically detect for v3.
For files with mixed line endings, I'm leaning towards throwing, because there's not really a reasonable thing to do. Whatever we choose, it could be the wrong behavior for the user, and quite subtle. We can give a flag for them to choose the override (below).
Printing a warning would be another option, but I'm generally a fan of failing fast, especially given that mixed line endings is probably an indication that something is weird and wrong about their file, and it'd be helpful for them to know about it.
Thoughts?
There was a problem hiding this comment.
I agree #216 would be a major change especially as it uses the system default rather than detecting and it also re-writes the entire doc.
I think it would be ok if document is consistent use what is detected, otherwise use configured which has a default of '\n'.
This would mean:
- \n docs already with upto date toc would always use \n
- \r\n docs already with upto date toc would use config option with default \n. Can Result in mixed doc as per now if not configured otherwise.
- \r\n docs already with toc containing normalised line endings via script would use \r\n, changed behaviour. This would be low likelihood but what user wants as it removes need to add in a normalisation process. Effectively a bug fix.
- \r\n doc without toc would use \r\n, changed behaviour. In this case it is a change but avoids creating a mixed document which I see as a positive.
- \n doc without toc would use \n, no changed behaviour.
So we would only get a chance in a few scenarios. Let me know if you are on the same page.
There was a problem hiding this comment.
Thinking about this some more, I feel we can merge this without the configurable default. That way this can be released as part of 2.4.1 due to it purely being a fix. This is because the only time a user would see a change is if they are post processing the toc to normalize the endings and all they would see is their processing would have no effect and for new docs we are avoiding introducing a line ending mismatch.
| if (crlf > lf && crlf > cr) return '\r\n'; | ||
| if (lf > crlf && lf > cr) return '\n'; | ||
| if (cr > crlf && cr > lf) return '\r'; |
There was a problem hiding this comment.
Hmmm this could be a major change, as I mentioned here.
IMO it'd make sense for this to be behind a flag for v2, and change the default to automatically detect for v3.
For files with mixed line endings, I'm leaning towards throwing, because there's not really a reasonable thing to do. Whatever we choose, it could be the wrong behavior for the user, and quite subtle. We can give a flag for them to choose the override (below).
Printing a warning would be another option, but I'm generally a fan of failing fast, especially given that mixed line endings is probably an indication that something is weird and wrong about their file, and it'd be helpful for them to know about it.
Thoughts?
| if (lf > crlf && lf > cr) return '\n'; | ||
| if (cr > crlf && cr > lf) return '\r'; | ||
|
|
||
| return '\n'; |
There was a problem hiding this comment.
That would ease some of what I mention above. Something like --line-ending-default or something, which we can fall back to (and not throw) in cases of mixed line endings.
|
@AndrewSouthpaw now that #338 is merged this one is good for review. |
Closes: #161
Closes: #216
Tool now detects what line endings have been used in the content and uses that in the generated toc to strive for consistency.
Blocked by: #337 & #338