[WIP] GH-73991: Add `os.copy()` and friends by barneygale · Pull Request #119079 · python/cpython
@zooba, I just noticed that the base system function BaseCopyStream() was updated to use the new system call NtCopyFileChunk(). This system call implements copying part of a file in the kernel, without needing separate read and write system calls. BaseCopyStream() is used by the CopyFile*() API functions.
This makes it more tempting to implement copyfile() with CopyFile2(). The problem is that copyfile() shouldn't copy timestamps, file attributes, extended attributes, or named data streams. The flag COPY_FILE_SKIP_ALTERNATE_STREAMS skips copying named data streams. Copying the last-write timestamp and file attributes is unavoidable, but it could be worked around as follows:
- Save the original file attributes from `lstat()` or `stat()` if the destination file already exists, or else use `FILE_ATTRIBUTE_NORMAL`.
- Restore or reset the file attributes via `SetFileAttributesW()`.
- Set the last-access and last-write timestamps to the current time via `os.utime()`. This also updates the file's change timestamp, if it has one (e.g. NTFS, but not FAT32).
- Or combine the attribute and timestamp update in a single function that calls `CreateFileW()` and `SetFileInformationByHandle()`: `FileBasicInfo`.
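A rough sketch of the first three steps, in case it helps the discussion. The names `copy_without_metadata` and `copy_func` are hypothetical; `copy_func` stands in for whatever CopyFile2()-based copy ends up being used, and the `SetFileAttributesW()` call via ctypes is only one possible way to do this, not a proposal for the actual C implementation.

```python
import os
import sys

FILE_ATTRIBUTE_NORMAL = 0x80  # value from the Windows SDK headers


def copy_without_metadata(src, dst, copy_func):
    """Copy src to dst with copy_func, then undo its metadata side effects.

    copy_func is a hypothetical stand-in for a CopyFile2()-based copy.
    """
    dst = os.fspath(dst)

    # 1. Remember the destination's attributes if it already exists.
    try:
        st = os.stat(dst)
    except FileNotFoundError:
        attrs = FILE_ATTRIBUTE_NORMAL
    else:
        # st_file_attributes is only available on Windows.
        attrs = getattr(st, "st_file_attributes", FILE_ATTRIBUTE_NORMAL)

    copy_func(src, dst)

    # 2. Restore or reset the file attributes (Windows only).
    if sys.platform == "win32":
        import ctypes
        from ctypes import wintypes

        kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
        kernel32.SetFileAttributesW.argtypes = (wintypes.LPCWSTR, wintypes.DWORD)
        kernel32.SetFileAttributesW.restype = wintypes.BOOL
        if not kernel32.SetFileAttributesW(dst, attrs):
            raise ctypes.WinError(ctypes.get_last_error())

    # 3. Reset the last-access and last-write timestamps to the current time.
    os.utime(dst)
    return attrs
```

The single-handle variant from the last bullet would replace steps 2 and 3 with one open of the destination, which avoids touching the file twice.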
The presence of extended attributes could be ignored, I suppose, since the Windows API provides no way to query or explicitly set them. In practice, probably the most common extended attributes are in the reserved "$KERNEL.*" namespace. Those can't be copied or set from user mode. Other extended attributes will be copied, such as the "$CI.CATALOGHINT" extended attribute that Microsoft sets on distributed files.
As discussed in past issues, copytree() and move() should support copying directory metadata and data streams on Windows. copytree() would call os.makedirs() to create the parent directories of dst. The destination directory would be created by copying from src using WinAPI CreateDirectoryExW() or, if that fails because dst exists, by falling back on CopyFile2().
CreateDirectoryExW() supports copying the following from the source directory: the case-sensitive flag (crucial to reliably copying a case-sensitive directory), file attributes, extended attributes, named data streams, and a symlink or junction reparse point. It doesn't allow the target directory to already exist.
CopyFile2() supports directories if passed the flag COPY_FILE_DIRECTORY. It copies the last-write timestamp, file attributes, extended attributes, named data streams, and a junction reparse point. Unlike CreateDirectoryExW(), it doesn't copy the case-sensitive flag. Copying a symlink requires the flag COPY_FILE_COPY_SYMLINK. If the destination directory exists, it must be empty in order to copy a symlink or junction.
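The control flow for creating the destination directory could look roughly like the sketch below. `copy_directory_node`, `create_from_template`, `copy_over_existing`, and `win32_create_from_template` are hypothetical names. The second callable stands in for a CopyFile2() call with COPY_FILE_DIRECTORY, which is left out here, and the ctypes wrapper around CreateDirectoryExW() is just an illustration that is only defined on Windows.

```python
import os
import sys


def copy_directory_node(src, dst, create_from_template, copy_over_existing):
    """Create dst as a copy of the directory src itself (no children).

    create_from_template(src, dst) should raise FileExistsError if dst exists,
    as a CreateDirectoryExW() wrapper would. copy_over_existing(src, dst)
    stands in for CopyFile2() with COPY_FILE_DIRECTORY.
    """
    # Make sure the parent directories of dst exist.
    parent = os.path.dirname(os.path.abspath(dst))
    os.makedirs(parent, exist_ok=True)
    try:
        # Preferred path: also copies the case-sensitive flag.
        create_from_template(src, dst)
        return "created"
    except FileExistsError:
        # dst already exists: fall back on copying metadata over it.
        copy_over_existing(src, dst)
        return "copied-over"


if sys.platform == "win32":
    import ctypes
    from ctypes import wintypes

    _kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    _kernel32.CreateDirectoryExW.argtypes = (
        wintypes.LPCWSTR,  # lpTemplateDirectory
        wintypes.LPCWSTR,  # lpNewDirectory
        wintypes.LPVOID,   # lpSecurityAttributes (NULL here)
    )
    _kernel32.CreateDirectoryExW.restype = wintypes.BOOL

    def win32_create_from_template(src, dst):
        # Raises an OSError built from the last Win32 error on failure.
        if not _kernel32.CreateDirectoryExW(os.fspath(src), os.fspath(dst), None):
            raise ctypes.WinError(ctypes.get_last_error())
```

Keeping the two Windows calls behind callables makes the fallback order easy to test without depending on either API's behaviour.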