bpo-36338: urllib.urlparse rejects invalid IPv6 addresses#16780
bpo-36338: urllib.urlparse rejects invalid IPv6 addresses#16780vstinner wants to merge 1 commit into
Conversation
|
@corona10: I wrote the scope test differently for make it more readable. |
Sorry, something went wrong.
corona10
left a comment
There was a problem hiding this comment.
Sorry, something went wrong.
|
@vadmium, @zooba, @tirkarthi: Would you mind to review this change? What do you think of my approach using ipaddress and excluding some characters from the IPv6 scope ("zone identifier")? |
Sorry, something went wrong.
|
I modified my PR to also fix https://bugs.python.org/issue33342 : "urllib IPv6 parsing fails with special characters in passwords". |
Sorry, something went wrong.
|
@orsenthil: Would you mind to review this change? Any idea for the allowed characters in an IPv6 scope? |
Sorry, something went wrong.
|
@vstinner - Sure, I will review. I will refer to the RFC for the valid characters for the IPv6 scope. |
Sorry, something went wrong.
|
I tried to allow [ and ] in the user:password part, but then the URL parser is confused by the URL: It reads it as IPv6 |
Sorry, something went wrong.
|
I rebased my PR to fix the merge conflict. @orsenthil: Ping for review. |
Sorry, something went wrong.
|
Firefox doesn't seem to accept % in the IPv6 part of an URL. When I type the following URL, it opens Google with the URL as a search... |
Sorry, something went wrong.
Same behavior in Chromium. |
Sorry, something went wrong.
orsenthil
left a comment
There was a problem hiding this comment.
LGTM. Thank you.
Sorry, something went wrong.
Done. Thank you, Victor. |
Sorry, something went wrong.
I expected these browsers to have percent-encode these and work. https://bugzilla.mozilla.org/show_bug.cgi?id=700999 Also, "Microsoft Edge (as well as Microsoft Explorer) works well with link local IPV6 addresses." This is the most relevant information I found https://en.wikipedia.org/wiki/IPv6_address#Use_of_zone_indices_in_URIs |
Sorry, something went wrong.
|
The living URL Standard doesn't implement IPv6 scope on purpose:
This comment points to https://www.w3.org/Bugs/Public/show_bug.cgi?id=27234#c2 which is a comment written by Ryan Sleevi at 2015-08-14:
Firefox feature request https://bugzilla.mozilla.org/show_bug.cgi?id=700999 has been rejected using this comment as well at 2015-08-14. Currently, only Microsoft Edge supports IPv6 scope: Firefox and Chromium don't. I suggest to follow Firefox, Chromium and living URL Standard example: don't support IPv6 scope. My current implementation doesn't implement the RFC 6874 which suggests to use |
Sorry, something went wrong.
This makes me even more uncomfortable to support IPv6 scope: it is not well defined if urlsplit() is expected to be used on a quote or unquoted URL. This is a major difference for RFC 6874 which is tied to quoted characters. Not well defined means: we should not have to dig into tests to reverse engineer the "expected" function behavior. It should be well documented and well tested. I mean that if someone wants to support IPv6 scope in URL, I suggest to first clarify what urlsplit() expects. IMHO fixing this is out of the scope of fixing https://bugs.python.org/issue36338 security vulnerability. |
Sorry, something went wrong.
|
I failed finding time to finish the PR. I prefer to abandon it. |
Sorry, something went wrong.
The urllib.urlparse module now rejects invalid IPv6 addresses and
invalid port numbers when parsing an URL.
https://bugs.python.org/issue36338