[WIP] GH-65238: Preserve trailing slash in pathlib by barneygale · Pull Request #112363 · python/cpython
Fix the last known case where pathlib can mangle the meaning of a path. This brings pathlib in line with IEEE Std 1003.1-2017, where trailing slashes are meaningful to path resolution and should not be discarded.
Changes
In several important respects, paths with trailing slashes behave differently to their slash-less counterparts:
- Paths with and without trailing slashes compare unequal and generate different hash codes
- `__str__()`, `__fspath__()` and related representations include any trailing slash
- `glob()` patterns ending with a slash will now generate results ending with a slash, matching `glob.glob()` behaviour
- `match()` now observes trailing slashes, and so its pattern language exactly matches that of `glob()`
To manipulate a trailing slash, we add these new methods/properties:
- `has_trailing_sep` - read-only boolean indicating whether a trailing slash is present
- `with_trailing_sep()` - returns a new path with a trailing slash present
- `without_trailing_sep()` - returns a new path with a trailing slash omitted
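The intended semantics of these three members can be sketched with plain string functions. This is a hypothetical illustration for POSIX-style strings only, not the implementation in this PR. How the empty path and the root `/` are treated here is an assumption: both are left unchanged, and the root is not counted as having a trailing separator.

```python
SEP = "/"  # POSIX separator; Windows handling is out of scope for this sketch


def has_trailing_sep(path: str) -> bool:
    """True if a separator follows at least one non-separator character."""
    # Assumption: "/" and "//" are roots, not paths with a trailing slash.
    return path.endswith(SEP) and path.rstrip(SEP) != ""


def with_trailing_sep(path: str) -> str:
    """Return path with a trailing separator present."""
    # Assumption: the empty path and paths already ending in a separator
    # (including the root) are returned unchanged.
    if path == "" or path.endswith(SEP):
        return path
    return path + SEP


def without_trailing_sep(path: str) -> str:
    """Return path with any trailing separator omitted."""
    if has_trailing_sep(path):
        return path.rstrip(SEP)
    return path


print(has_trailing_sep("foo/"), has_trailing_sep("foo"))
print(with_trailing_sep("foo"))
print(without_trailing_sep("foo/"))
```

In the PR these live on the path objects themselves and return new path objects rather than strings.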
Backwards compatibility
Empty segments given to the `PurePath` initialiser do not generate new segments, so `str(PurePath("foo", ""))` results in `"foo"`, not `"foo/"`.
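The contrast with `os.path`-style joining may make this point clearer. A small sketch using `PurePosixPath` and `posixpath` so the output does not depend on the host platform:

```python
import posixpath
from pathlib import PurePosixPath

# posixpath.join() treats an empty final segment as "end with a separator".
joined_with_posixpath = posixpath.join("foo", "")

# The PurePath initialiser ignores the empty segment entirely.
joined_with_pathlib = str(PurePosixPath("foo", ""))

print(joined_with_posixpath)  # foo/
print(joined_with_pathlib)    # foo
```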
Methods concerned with dirnames and basenames ignore any trailing slash:
- `name`, `stem`, `suffix` and `suffixes` retrieve the last non-empty path segment, and so any trailing slash is ignored
- `with_name()`, `with_stem()` and `with_suffix()` replace the last non-empty path segment
- `parent` and `parents` ignore any trailing slash
- `relative_to()` and `is_relative_to()` ignore any trailing slashes in `self` and `other`
- The `.parts` tuple works exactly as before, and doesn't distinguish paths with trailing separators
  - Not sure if this is right tbh
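As a rough illustration of the results listed above, the snippet below compares `posixpath`, which treats the text after the last slash as the basename, with pathlib. It was written against a released Python, where the slash is dropped when the path is parsed; according to the list above, these particular results stay the same with this PR. The path `docs/readme.txt/` is made up for the example.

```python
import posixpath
from pathlib import PurePosixPath

raw = "docs/readme.txt/"

# posixpath splits at the final slash, so the basename is empty.
print(repr(posixpath.basename(raw)))  # ''
print(posixpath.dirname(raw))         # docs/readme.txt

# pathlib works with the last non-empty segment instead.
p = PurePosixPath(raw)
print(p.name)    # readme.txt
print(p.stem)    # readme
print(p.suffix)  # .txt
print(p.parent)  # docs
print(p.parts)   # ('docs', 'readme.txt')

# Trailing slashes in `other` do not affect the relative-path checks.
print(p.is_relative_to("docs/"))  # True
print(p.relative_to("docs/"))     # readme.txt
```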
Dependencies
- GH-112361: Speed up pathlib by removing some temporary objects. #112362
- GH-106747: Improve `Path.glob()` expectations in pathlib tests #112365
- GH-112675: Move path joining tests into `test_posixpath` and `test_ntpath` #112676
- GH-65238: Add test cases for trailing slash handling in pathlib #113248
Future work
Once this lands, we can take advantage of the fact that pathlib does not mangle paths. Thus:
- We can make `__fspath__()` return an unnormalized path for a tasty speed boost in many applications
- We can call into `pathlib` from `shutil`, which means we can add methods like `chown()`, `move()` and `rmtree()` without code duplication