[WIP] GH-65238: Preserve trailing slash in pathlib by barneygale · Pull Request #112363 · python/cpython

Fix the last known case where pathlib can mangle the meaning of a path. This brings pathlib in line with IEEE Std 1003.1-2017, where trailing slashes are meaningful to path resolution and should not be discarded.
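As a rough illustration of why the trailing slash matters to path resolution, the sketch below stats a regular file with and without a trailing separator. It uses only the standard `os` and `tempfile` modules. The helper name `stat_succeeds` and the file name `data.txt` are made up for this example, and the behaviour described assumes a POSIX system.

```python
import os
import tempfile


def stat_succeeds(path: str) -> bool:
    """Return True if os.stat() can resolve the path, False on any OSError."""
    try:
        os.stat(path)
    except OSError:
        return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    file_path = os.path.join(tmp, "data.txt")
    with open(file_path, "w") as f:
        f.write("x")

    # A regular file resolves without a trailing separator.
    plain_ok = stat_succeeds(file_path)
    # With a trailing separator the path must name a directory, so on POSIX
    # systems this is expected to fail (typically with ENOTDIR).
    slash_ok = stat_succeeds(file_path + os.sep)
    # A directory resolves with a trailing separator.
    dir_slash_ok = stat_succeeds(tmp + os.sep)

print(plain_ok, slash_ok, dir_slash_ok)
```

If pathlib drops the slash before the path reaches the operating system, the second case silently turns into the first.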

Changes

In several important respects, paths with trailing slashes behave differently to their slash-less counterparts:

  • Paths with and without trailing slashes compare unequal and generate different hash codes
  • __str__(), __fspath__() and related representations include any trailing slash
  • glob() patterns ending with a slash will now generate results ending with a slash, matching glob.glob() behaviour
  • match() now observes trailing slashes, and so its pattern language exactly matches that of glob().
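The `glob.glob()` behaviour referred to above can be seen with a small sketch. It only exercises the existing `glob` module, not the pathlib changes in this PR. The directory and file names (`pkg`, `notes.txt`) are arbitrary.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "pkg"))
    with open(os.path.join(tmp, "notes.txt"), "w") as f:
        f.write("x")

    # Pattern ending in a separator: only directories match, and each
    # result keeps the trailing separator.
    with_slash = glob.glob(os.path.join(tmp, "*") + os.sep)
    # Pattern without a trailing separator: files and directories match.
    without_slash = sorted(glob.glob(os.path.join(tmp, "*")))

    # Strip the temporary directory prefix so the results are easy to read.
    rel_with_slash = [p[len(tmp) + 1:] for p in with_slash]
    rel_without_slash = [p[len(tmp) + 1:] for p in without_slash]

print(rel_with_slash)
print(rel_without_slash)
```

With this PR, a `Path.glob()` pattern ending in a slash is intended to produce results shaped like the first list.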

To manipulate a trailing slash, we add these new methods/properties:

  • has_trailing_sep - read-only boolean indicating whether a trailing slash is present
  • with_trailing_sep() - returns a new path with a trailing slash present
  • without_trailing_sep() - returns a new path with a trailing slash omitted
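To make the intended semantics concrete, here is a minimal sketch of the three operations as plain string helpers. It is not the implementation in this PR: the helpers are free functions rather than a property and methods on `PurePath`, they assume POSIX-style `/` separators, and the treatment of the empty path and of root-only paths such as `/` is an assumption made for this example.

```python
SEP = "/"  # assumption: POSIX-style separator only


def _is_root_only(path: str) -> bool:
    """True for paths made up solely of separators, such as "/"."""
    return path != "" and path.strip(SEP) == ""


def has_trailing_sep(path: str) -> bool:
    """True if the path ends with a separator that is not just the root."""
    return path.endswith(SEP) and not _is_root_only(path)


def with_trailing_sep(path: str) -> str:
    """Return the path with a trailing separator present."""
    # Assumption: empty and root-only paths are returned unchanged, because
    # appending a separator would change what they refer to.
    if path == "" or _is_root_only(path) or has_trailing_sep(path):
        return path
    return path + SEP


def without_trailing_sep(path: str) -> str:
    """Return the path with any trailing separator omitted."""
    if has_trailing_sep(path):
        return path.rstrip(SEP)
    return path


print(has_trailing_sep("foo/"), has_trailing_sep("foo"))
print(with_trailing_sep("foo"), without_trailing_sep("foo/"))
```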

Backwards compatibility

Empty segments given to the PurePath initialiser do not generate new segments, so str(PurePath("foo", "")) results in "foo", not "foo/".
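A quick check of the initialiser behaviour described above, using `PurePosixPath` so the separator is the same on every platform:

```python
from pathlib import PurePosixPath

# An empty final segment does not add a trailing slash.
joined = PurePosixPath("foo", "")
# An empty segment in the middle does not add an empty component either.
joined_middle = PurePosixPath("foo", "", "bar")

print(str(joined))
print(str(joined_middle))
```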

Methods concerned with dirnames and basenames ignore any trailing slash:

  • name, stem, suffix and suffixes retrieve the last non-empty path segment, and so any trailing slash is ignored
  • with_name(), with_stem() and with_suffix() replace the last non-empty path segment
  • parent and parents ignore any trailing slash
  • relative_to() and is_relative_to() ignore any trailing slashes in self and other.
  • The .parts tuple works exactly as before, and doesn't distinguish between paths with and without trailing separators
    • Not sure if this is right tbh
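The properties listed above can be checked with a short sketch. The values below are the same whether or not the trailing slash is preserved, since these members look at the last non-empty segment. The path `archive/report.tar.gz/` is an arbitrary example.

```python
from pathlib import PurePosixPath

p = PurePosixPath("archive/report.tar.gz/")

# Basename-related properties use the last non-empty segment.
name = p.name
stem = p.stem
suffix = p.suffix
suffixes = p.suffixes

# Dirname-related members ignore the trailing slash as well.
parent = str(p.parent)
parts = p.parts
relative = p.is_relative_to("archive/")

print(name, stem, suffix, suffixes)
print(parent, parts, relative)
```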

Dependencies

Future work

Once this lands, we can take advantage of the fact that pathlib does not mangle paths. Thus: