Make filesystem::rename POSIX compliant#2062
Make filesystem::rename POSIX compliant#2062SibiSiddharthan wants to merge 3 commits intomicrosoft:mainfrom
Conversation
|
I have not yet fixed the issue regarding the read-only bit as I needed to clarify something before I get started with that. Looks like POSIX rename does not support cross device renaming and such an operation fails with This is actually tied to whether we can use |
|
I would like to add a few more tests cases to I propose 2 test cases for this:
NOTE: These tests cases don't depend on the changes introduced in this PR. So should I bundle these test cases with this PR itself or raise a separate one? |
adfab0b to
16873c5
Compare
|
Instead of write in your first comment. |
Thanks for the info |
| [[nodiscard]] __std_win_error __stdcall __std_fs_rename( | ||
| _In_z_ const wchar_t* const _Source, _In_z_ const wchar_t* const _Target) noexcept { | ||
| __std_win_error _Last_error; | ||
| constexpr auto _Flags = __std_fs_file_flags::_Backup_semantics | __std_fs_file_flags::_Open_reparse_point; |
There was a problem hiding this comment.
Does __std_fs_file_flags::_Backup_semantics work under a limited user account?
There was a problem hiding this comment.
Not sure, but we need this to rename a directory.
There was a problem hiding this comment.
I don't think MoveFileEx uses this flag internally, as it can use NtOpenFile directly.
My concern is about this part of CreateFileW doc:
FILE_FLAG_BACKUP_SEMANTICS
0x02000000
The file is being opened or created for a backup or restore operation. The system ensures that the calling process overrides file security checks when the process has SE_BACKUP_NAME and SE_RESTORE_NAME privileges. For more information, see Changing Privileges in a Token.
If there's a chance that there's dependency on privilege, the function may behave inconsistently, depending on that.
There was a problem hiding this comment.
Do we have a way to get a HANDLE to a directory without FILE_FLAG_BACKUP_SEMANTICS? I am not sure about using the Nt* stuff. Is it allowed here?
There was a problem hiding this comment.
I don't know. Undocumented stuff is not allowed (#705). But maybe NtOpenFile is fine, there's a documentation https://docs.microsoft.com/en-us/windows/win32/api/winternl/nf-winternl-ntopenfile
Would be good if just using FILE_FLAG_BACKUP_SEMANTICS is fine, but need to confirm it.
There was a problem hiding this comment.
FILE_FLAG_BACKUP_SEMANTICS can be used by limited users; we already use it elsewhere in this implementation. The documentation is somewhat confusing, what it means is that if you have taken appropriate privileges, and you use the flag, then you get the effects of those privileges (ignoring read restrictions with SeBackupPrivilege, and write restrictions with SeRestorePrivilege). If you haven't taken those privileges then you still get the ability to open a directory and other effects of the flag, you just don't get to ignore ACLs.
There was a problem hiding this comment.
I don't know. Undocumented stuff is not allowed (#705). But maybe NtOpenFile is fine, there's a documentation https://docs.microsoft.com/en-us/windows/win32/api/winternl/nf-winternl-ntopenfile
NtOpenFile is not available under the "app"/"onecore" flavors, to my understanding.
| FILE_ID_INFO Id_a, Id_b; | ||
| if (GetFileInformationByHandleEx(_Handle_a, FileIdInfo, &Id_a, sizeof(FILE_ID_INFO)) | ||
| && GetFileInformationByHandleEx(_Handle_b, FileIdInfo, &Id_b, sizeof(FILE_ID_INFO))) { | ||
| return memcmp(Id_a.FileId.Identifier, Id_b.FileId.Identifier, sizeof(FILE_ID_128)) == 0; |
There was a problem hiding this comment.
A question - how would this behave on simpler filesystems, like FAT or exFAT. They don't have hard links, but do they have FILE_ID_INFOs? And if not, would GetFileInformationByHandleEx fail, or successfully return zero/garbage?
Also looks like VolumeSerialNumber should be compared too, there may be different volumes containing files with the same ID
There was a problem hiding this comment.
rename does not work with files across volumes. See https://pubs.opengroup.org/onlinepubs/009604599/functions/rename.html or https://man7.org/linux/man-pages/man2/rename.2.html. So I don't think we should compare VolumeSerialNumber.
With regards to FAT filesystems, I pulled out a USB stick and tried it. Looks like GetFileInformationByHandleEx fails with ERROR_INVALID_PARAMETER when we request FileIdInfo. So no worries 😄
|
|
||
| // Choosing FILE_STANDARD_INFO over FILE_BASIC_INFO to check whether the handles refer to directories. | ||
| // Because sizeof(FILE_STANDARD_INFO) = 24 < sizeof(FILE_BASIC_INFO) = 40. | ||
| // How expensive is this operation? (comment for reviewers) |
There was a problem hiding this comment.
My guess it does not matter much. Though FILE_STANDARD_INFO is more demanded information while working with file, so more likely to be easier to get.
| [[nodiscard]] bool __stdcall _Is_same_file(_In_ HANDLE _Handle_a, _In_ HANDLE _Handle_b) { | ||
| constexpr auto _Flags = __std_fs_file_flags::_Backup_semantics | __std_fs_file_flags::_Open_reparse_point; | ||
| FILE_ID_INFO Id_a, Id_b; | ||
| if (GetFileInformationByHandleEx(_Handle_a, FileIdInfo, &Id_a, sizeof(FILE_ID_INFO)) |
There was a problem hiding this comment.
Note that FILE_ID_INFO is not available on Windows 7. Consider falling back to GetFileInformationByHandle and examining BY_HANDLE_FILE_INFORMATION::nFileIndexLow/High.
There was a problem hiding this comment.
Note that
FILE_ID_INFOis not available on Windows 7. Consider falling back toGetFileInformationByHandleand examiningBY_HANDLE_FILE_INFORMATION::nFileIndexLow/High.
I have a few questions regarding BY_HANDLE_FILE_INFORMATION::nFileIndexLow/High and FILE_ID_INFO::FileId.
According to the docs both of them represent some sort of File Ids. I did some testing with a bunch of random files across different volumes, I found that BY_HANDLE_FILE_INFORMATION::nFileIndexLow/High , FILE_ID_INFO::FileId.Identifier[0..7] are different and FILE_ID_INFO::FileId.Identifier[8..15] is always 0. BY_HANDLE_FILE_INFORMATION::dwVolumeSerialNumber and FILE_ID_INFO::VolumeSerialNumber are also different.
Can anyone explain me why is this?
For checking whether the files are hardlinks I think it is better to stick to BY_HANDLE_FILE_INFORMATION::nFileIndexLow/High for Windows 7 compatibilty.
Regarding the use of __std_fs_get_file_id, I am still awaiting a response to this #2062 (comment) .
There was a problem hiding this comment.
On NTFS file ids are 64-bit values. ReFS has 128-bit file ids. FILE_ID_INFO etc. were introduced to support the expanded file id space needed by ReFS, but on NTFS the ids remain 64-bit values and the upper 64-bits will always be zero. In order to properly support all edge cases on ReFS volumes, it is best to use the 128-bit file id type on systems that support it.
Similarly, the legacy structure has a 32-bit VSN field and the new structure returns the full 64-bit VSN - the different values are due to truncation.
|
Even more so than remove, I believe the most important property of rename is that it is atomic -- it is considerably more important that it be so than remove because rename is the only usually atomic filesystem operation. I think a more appropriate fix would be to document this divergence under http://eel.is/c++draft/fs.conform.9945#2 |
Can it be done atomic by not calling |
If the overall goal is POSIX-compliance, I would suggest |
From the documentation:
Doesn't seem to address empty directory issue |
True. FileRenameInfoEx's POSIX semantics would make it immediately overwrite an existing target file name even if it is currently open, but attempts to overwrite an existing empty directory still fail with |
My plan was to use Windows 7 only supports |
|
I think “use recent Win10 if you care about these cases” is an acceptable outcome, in the same way that remove can’t remove open things on old OSes. To the best of my understanding we can’t call any NtXxx things — it isn’t about whether they’re documented, it’s about whether they will be accepted by the Store, and whether they have apisets (to be callable from OneCore environments) |
Fixed bugs 1) Renaming files which are hard-links of each other should be a no-op. 2) Renaming directories where the new path points to an empty directory should succeed. Added test cases as well for the above.
Changes: Renamed _Same_file -> _Is_same_file Better return way for _Is_same_file
|
Just to butt in, we're seeing an issue with |
Hey, if you refer to this comment #2062 (comment). I have my proposed changes for rename. This does include |
|
I'm not sure we want POSIX semantics, because it means suddenly this function becomes case sensitive. So you can end up in a situation where you have two files with the same name but different cases after rename, which is a pain for the user to resolve (because almost no tooling is case sensitive so they will behave incorrectly upon trying to delete the file - for example not even fs::remove is case sensitive, afaik) |
POSIX semantics does not mean case sensitive. It means that you will have the ability to rename a file even though there are existing handles open to it. See here https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/ntifs/ns-ntifs-_file_rename_information |
Yep, that's the idea. From what I can see, the existing implementation of |
There is a big difference though in that rename being atomic is important, particularly since it is the only filesystem operating that is reliably atomic everywhere. |
Hey, @BillyONeal can you clarify whether |
It is, but it would not be with this change. |
Alright, So using |
I believe that's right but I will defer to current maintainers here. |
|
So first, the current implementation's behavior:
My opinion: I am of the same opinion as Billy; making our implementation non-atomic is unacceptable. I'd be willing to take a look at the Marking as draft as the implementation needs to be changed completely. @SibiSiddharthan thank you for your work so far, I really appreciate it! sorry for letting this languish for so long :( |
|
Hey,
I understand your decision.
The Since For older Windows versions as cannot remove the readonly bit and delete the file atomically, we will avoid the This will also address In Summary, newer Windows versions will have a more standard conforming |
|
@SibiSiddharthan disallowing I'm not sure; I'm curious how Billy feels. It makes me uncomfortable, but not overly so. I think I would still prefer to use MoveFileEx if possible, since that's the "obvious" API for |
Yes they do. Please refer this link https://pubs.opengroup.org/onlinepubs/9699919799/functions/rename.html. See error code EDIT: I have gone through the standard again. Looks like |
I believe the C++ standard referencing POSIX here is only because POSIX was an ISO standard, and we didn't want to have to define "what is a file" and every semantic over again from there. http://eel.is/c++draft/fs.conform.9945#2 is the explicit callout to be "get as close to that as is reasonable" and I don't think we should be issuing substantially higher numbers of syscalls or similar to replicate every edge case. That said, in this specific case, MoveFileEx already doesn't allow movement between file systems without Whether we can now that std::filesystem has already been shipping for 5 years or so is something else. It isn't a change I would make but I think it would be a "righteous" one if it is indeed clear that POSIX prohibits it.
Hmmmmm I'm not sure I'm convinced on this one. It says that you might get this error in "you may get these errors" remarks but that doesn't seem like "all implementations must give you this error". As for |
|
I am marking this Thank you, @SibiSiddharthan, for this PR! It's still a good PR even if it is not something we wish to take. |
|
Hey, PS: Sorry for the late reply. Have been quite busy lately. |
|
Thanks again @SibiSiddharthan - we talked about this at the weekly maintainer meeting and everyone concurs with closing this PR and keeping the current atomic behavior. |

Fixed bugs
should succeed.
Added test cases as well for the above.
Fixes #2053