bpo-46337: Urllib.parse scheme-specific behavior without reliance on URL scheme by oldaccountdeadname · Pull Request #30520 · python/cpython
Some features in urllib depend on the URL scheme (e.g., preserving the netloc when joining URLs). Prior to this patch, this was governed by the uses_* lists (uses_relative, uses_netloc, uses_params), which hard-code these attributes for certain schemes. Providing an enum interface and a 'constructor' that allows overrides makes this mechanism more flexible for future modifications.
This allows callers of urljoin and urlparse to specify scheme classes that are guaranteed to apply to the URL regardless of its actual scheme, which may not be in the default uses_* lists. This call-time behavior is exposed through an optional parameter, so backwards compatibility is preserved. A test case is added for this; it requires the change made to test_urlparse.checkJoin.
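For context, a minimal sketch of the existing mechanism that the patch aims to make more flexible: how the module-level uses_* lists decide whether urljoin resolves a relative reference. It deliberately does not use the new optional parameter, whose name is not given here. The `registered_scheme` helper is hypothetical and only exists for this illustration.

```python
from contextlib import contextmanager
from urllib import parse


@contextmanager
def registered_scheme(scheme):
    """Temporarily add `scheme` to uses_relative and uses_netloc.

    Hypothetical helper for illustration; it mutates module-level
    state and restores it on exit.
    """
    added = []
    for scheme_list in (parse.uses_relative, parse.uses_netloc):
        if scheme not in scheme_list:
            scheme_list.append(scheme)
            added.append(scheme_list)
    try:
        yield
    finally:
        for scheme_list in added:
            scheme_list.remove(scheme)


base = "my-protocol://example.org/a/b"

# Unknown scheme: urljoin returns the relative reference unchanged.
unregistered = parse.urljoin(base, "c")

# Scheme listed in uses_relative and uses_netloc: the reference is
# resolved against the base, and the netloc is preserved.
with registered_scheme("my-protocol"):
    registered = parse.urljoin(base, "c")

print(unregistered)  # c
print(registered)    # my-protocol://example.org/a/c
```

Mutating the shared lists affects every caller in the process, which is the kind of global coupling a per-call override avoids.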
urljoin does not treat `..` as moving up one directory; it resolves `..` relative to the last path segment as a file, so the doctests failed due to a missing trailing slash. Both changes are of the form: `http://example.org/post/x` -> `http://example.org/post/x/`. Additionally, the my-protocol example's expected output had the wrong scheme.
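The effect of the trailing slash can be checked with the standard urljoin. This is a small sketch using the example.org URLs from the message above, not the actual doctests from the patch.

```python
from urllib.parse import urljoin

# Without a trailing slash, "x" is treated as a file, so ".." goes up
# from the directory containing it (/post/) to the root.
without_slash = urljoin("http://example.org/post/x", "..")

# With a trailing slash, "x" is a directory, so ".." goes up to /post/.
with_slash = urljoin("http://example.org/post/x/", "..")

print(without_slash)  # http://example.org/
print(with_slash)     # http://example.org/post/
```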
Minor style things:
+ _scheme_classes' docstring's summary was made explicit.
+ _scheme_classes was prepended with and followed by two newlines.