Conversation
…onding .cpp file found #1042
|
Feel free to merge if you're happy with it. |
… for speed reasons. Files without shebang in first line or non UTF-8 encoding in first 1024 bytes get deleted on build.
|
@jdtournier suggested checking for the presence of a shebang line in the file Instead of detecting whether a file is binary. This approach deletes everything that does not have a corresponding .cpp file, is not decodable as UTF-8, or does not start with For giggles, I timed 20 build runs: |
Sounds familiar 😅 👍 Only risk is that if a bash script were added to |
Yeah, I was thinking along the same lines as well. I think I'd be more comfortable with that. |
I knew that rang a bell... 🔔 Otherwise, I'd tend to agree with @Lestropie & @thijsdhollander on the potential for scripts to get clobbered because the developer forgot to include the shebang line, and hadn't committed his changes yet. Note that my previous suggestion of relying on As far as I can tell, the current approach would remove a script that's missing a shebang? Even if it was valid utf8? Maybe it would be better to:
Other idea might be to move 'removed' files to a dedicated folder (say Also: we should use this approach in What do you reckon? |
On that note, maybe |
|
Agreed, moving instead of removing is a better solution. Any preference on
Yes.
This would require handling the byte order mark (see here). Any reason not to decode as utf8? If so, it might be better to fall back to binary detection (8b5d8d0) which also does not have the problem of removing scripts / text files that might miss a shebang.
Not sure I understand, are you referring to moving objects instead of deleting or replacing |
I didn't know about that, good catch. However, we don't use it, and I've just tried inserting it at the beginning of the verified that it inserts the right character (0xEF 0xBB 0xBF) using looks like it's reading pairs of bytes back to front, but it checks out: But So to ensure compatibility at least with MSYS2, it looks like we can't use the BOM. Besides, from that same wikipedia page:
Regarding the What do you reckon? |
… of deleting them
|
This version now moves files out of This binary file detection requires knowing the |
|
Open question: If one runs |
|
Bumping and adding "question" tag; I tested & was happy to merge save for the question in the prior comment. |
|
No strong preference but I'd suggest keeping the contents of |
Addresses #1042: remove binary files in the target bin folder if no corresponding
.cppfile can be found. The check whether a file is binary should be conservative butcheck_encodingmakes it a rather expensive operation and could be relaxed if need be.Removal is triggered for every build, irrespective of the build target.