-
Notifications
You must be signed in to change notification settings - Fork 5.3k
Tar: Write unused GNU tar entry size, ctime and atime bytes in the expected format #114868
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…eld should be filled with "0" characters. The bug was that we were filling them with nulls like in the other formats. Added sync and async tests to validate directly in the memory stream that the unused bytes in the size field are indeed written as "0" characters.
…is format) need to be written as null characters if their value is the Unix Epoch. The bug was that we were writing them as "0" characters as other GNU metadata fields. This was causing tools like 7-zip to show an extra unexpected folder with a name formed by just "0" characters. Other tools were unable to read these archives as they were considered corrupt. Added sync and async tests to validate directly in the memory stream that when atime and ctime represent the Unix Epoch, they are indeed written as null characters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR fixes issues with GNU tar entry metadata where atime and ctime fields are incorrectly written as "0" characters instead of null characters, and ensures that unused bytes in the size field are correctly set to "0" characters. Key changes include adding sync and async tests for these behaviors and adjusting header writing logic to correctly format these fields.
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/libraries/System.Formats.Tar/tests/TarEntry/GnuTarEntry.Tests.cs | Added tests to verify unused size field bytes and correct handling of Unix Epoch timestamps |
| src/libraries/System.Formats.Tar/src/System/Formats/Tar/TarHeader.Write.cs | Modified header writing logic to remove unnecessary _dataStream check and return 0 for UnixEpoch timestamps |
src/libraries/System.Formats.Tar/src/System/Formats/Tar/TarHeader.Write.cs
Show resolved
Hide resolved
src/libraries/System.Formats.Tar/src/System/Formats/Tar/TarHeader.Write.cs
Show resolved
Hide resolved
|
Tagging subscribers to this area: @dotnet/area-system-formats-tar |
src/libraries/System.Formats.Tar/src/System/Formats/Tar/TarHeader.Write.cs
Show resolved
Hide resolved
|
Need to adjust the checksum unit tests, as the expected values are wrong now. |
…egular timestamps (mtime) should not be affected by this. Modify all the sync and async checksum shared code so that they reflect the changes in the expected values of size, atime and ctime.
|
I pushed an extra commit to separate the new timestamp checking logic from the old method and created another one just for the gnu timestamps. I also fixed all the checksum unit tests, sync and async, since the expected values for gnu were now different. I also separated the gnu checksum verification tests into their own separate ones since they need to additionaly verify the epoch values (no other formats will have this issue). |
This PR fixes two bugs in the GNU format:
The atime and ctime fields (unique to this format) need to be written as null characters if their value is the Unix Epoch. The bug was that we were writing them as "0" characters as other GNU metadata fields. This was causing tools like 7-zip to show an extra unexpected folder with a name formed by just "0" characters. Other tools were unable to read these archives as they were considered corrupt.
Added sync and async tests to validate directly in the memory stream that when atime and ctime represent the Unix Epoch, they are indeed written as null characters.
The unused bytes of the size metadata field should be filled with "0" characters. The bug was that we were filling them with nulls like in the other formats.
Added sync and async tests to validate directly in the memory stream that the unused bytes in the size field are indeed written as "0" characters.
Added sync and async tests to validate directly in the memory stream that when atime and ctime represent the Unix Epoch, they are indeed written as null characters.
Manual test: reading entries in a GNU archive generated by another tool, then rewriting each entry into a new archive created with TarWriter. Then opening it with 7-zip.
Before - Note how 7-zip did its best to deduce the minimum required fields, but most was empty:

After - Note that 7-zip now shows many other fields that it was unable to read before:
