GH-81079: Add case_sensitive argument to `pathlib.Path.glob()` by barneygale · Pull Request #102710 · python/cpython
This argument allows case-sensitive matching to be enabled on Windows, and case-insensitive matching to be enabled on Posix.
This PR removes _PreciseSelector and uses _WildcardSelector to select non-wildcard patterns. This allows case sensitivity rules to be varied, and ensures the paths returned from glob() use filesystem casing.
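For anyone skimming, here is a minimal sketch of how the new argument might be used. It assumes Python 3.12 or later, where `case_sensitive` is available on `Path.glob()`. The helper names `glob_names` and `demo` are made up for illustration and are not part of the patch.

```python
import sys
import tempfile
from pathlib import Path


def glob_names(root, pattern, case_sensitive=None):
    """Return the sorted file names under root that match pattern.

    Hypothetical helper. case_sensitive=None keeps the platform default.
    """
    return sorted(
        p.name for p in Path(root).glob(pattern, case_sensitive=case_sensitive)
    )


def demo():
    """Glob the same directory with both explicit settings."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "notes.txt").touch()
        (root / "README.TXT").touch()
        insensitive = glob_names(root, "*.txt", case_sensitive=False)
        sensitive = glob_names(root, "*.txt", case_sensitive=True)
    return insensitive, sensitive


# The keyword does not exist before Python 3.12, so guard the call.
if sys.version_info >= (3, 12):
    print(demo())
```

With `case_sensitive=False` both files should be returned under their on-disk names, and with `case_sensitive=True` only `notes.txt` should match, regardless of platform.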
It might be nice to describe the =None value as meaning "using the native case comparison rules for the filesystem". Windows allows marking directories as case-sensitive, and one day we might actually respect that. Similarly, pretty sure you can use FAT from any OS, and we ought to treat that as case insensitive by default.
We don't have to implement those yet, but let's not lock ourselves out by documenting the specific rules. We can also file a feature request now to implement proper case support - it might help find someone motivated to make it happen.
Also, like with the symlink change, I'd like to see clearly stated which parts of the pattern are affected, as I assume it only applies to segments from the first wildcard (that is, foo/bar*/baz with case_sensitive=False on a case sensitive drive matches foo/BARF/BAZ but not FOO/barf/baz).
I trust you (and the tests) on the implementation. It takes me a good few hours to get this code into my head in a way I can judge it, and I don't have time for that today, but also don't want to block on it when all the tests look good :)
Thanks for taking a look!
> It might be nice to describe the `=None` value as meaning "using the native case comparison rules for the filesystem". Windows allows marking directories as case-sensitive, and one day we might actually respect that. Similarly, pretty sure you can use FAT from any OS, and we ought to treat that as case insensitive by default.
>
> We don't have to implement those yet, but let's not lock ourselves out by documenting the specific rules. We can also file a feature request now to implement proper case support - it might help find someone motivated to make it happen.
Good shout, will do!
> Also, like with the symlink change, I'd like to see clearly stated which parts of the pattern are affected, as I assume it only applies to segments from the first wildcard (that is, `foo/bar*/baz` with `case_sensitive=False` on a case sensitive drive matches `foo/BARF/BAZ` but not `FOO/barf/baz`).
It affects all parts of the pattern! So it would match both your examples. I think this is the behaviour users expect when passing case_sensitive=False.
(The same thing will apply when we add support for follow_symlinks=True and follow_symlinks=False -- it will affect all parts of the pattern uniformly)
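To make the "all parts of the pattern" point concrete, here is a small sketch that checks both example layouts against `foo/bar*/baz`. It assumes Python 3.12 or later. `match_tree` is a hypothetical helper, and each layout gets its own temporary directory so that `foo` and `FOO` cannot collide on a case-insensitive filesystem.

```python
import sys
import tempfile
from pathlib import Path


def match_tree(layout, pattern, case_sensitive):
    """Create one file at `layout` in a fresh temp dir and glob for it.

    Hypothetical helper. Returns the matches relative to the temp dir,
    spelled with forward slashes.
    """
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        target = root / layout
        target.parent.mkdir(parents=True)
        target.touch()
        # Consume the generator before the directory is cleaned up.
        return sorted(
            p.relative_to(root).as_posix()
            for p in root.glob(pattern, case_sensitive=case_sensitive)
        )


# The keyword does not exist before Python 3.12, so guard the calls.
if sys.version_info >= (3, 12):
    for layout in ("foo/BARF/BAZ", "FOO/barf/baz"):
        print(layout, match_tree(layout, "foo/bar*/baz", False))
```

Both layouts should match when `case_sensitive=False`, including the one where only the leading non-wildcard segment differs in case, and the results keep the casing found on disk. With `case_sensitive=True` neither layout should match.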
> I trust you (and the tests) on the implementation. It takes me a good few hours to get this code into my head in a way I can judge it, and I don't have time for that today, but also don't want to block on it when all the tests look good :)
I'll expand the PR description to (hopefully) make it easier for you (or anyone else) to get into the patch.
> It affects all parts of the pattern! So it would match both your examples. I think this is the behaviour users expect when passing `case_sensitive=False`.
>
> (The same thing will apply when we add support for `follow_symlinks=True` and `follow_symlinks=False` -- it will affect all parts of the pattern uniformly)
I was thinking of the =None case, especially since for case_sensitive that's a legitimate and arguably more useful value (whereas for symlinks it's deprecated/legacy). But applying to the whole pattern, even if the first few parts are constants, seems fine.
LGTM