Issue 33342: urllib IPv6 parsing fails with special characters in passwords
Created on 2018-04-23 13:44 by benaryorg, last changed 2022-04-11 14:58 by admin.
| Messages (7) | |||
|---|---|---|---|
| msg315668 - (view) | Author: benaryorg (benaryorg) | Date: 2018-04-23 13:44 | |
The documentation says that urllib.parse follows RFC 2396 (https://tools.ietf.org/html/rfc2396.html), but urllib.parse.urlsplit (https://docs.python.org/3/library/urllib.parse.html#urllib.parse.urlsplit) fails to parse a user:password@host URL when the password contains a '[' character. This is because the urlsplit code does not strip the userinfo part of the authority (everything from index 0 up to and including the last '@') before checking whether the hostname contains '[' to detect an IPv6 address (https://github.com/python/cpython/blob/8a6f4b4bba950fb8eead1b176c58202d773f2f70/Lib/urllib/parse.py#L416-L418). |
|||
| msg317119 - (view) | Author: Martin Panter (martin.panter) * | Date: 2018-05-19 13:49 | |
I presume this is about parsing a URL like
>>> urlsplit("//user:[@host")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/proj/python/cpython/Lib/urllib/parse.py", line 431, in urlsplit
    raise ValueError("Invalid IPv6 URL")
ValueError: Invalid IPv6 URL
Ideally the square bracket should be escaped as %5B. Related reports about parsing unescaped delimiters in a URL password are Issue 18140 (fragment #, query ?) and Issue 23328 (slash /).
|
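A minimal sketch of the escaping suggested above, assuming a hypothetical password `pa[ss` and the placeholder host `host`. It uses `urllib.parse.quote` with `safe=""` so that every reserved character, including `/`, is percent-encoded before the URL is assembled. The unescaped attempt is wrapped in `try`/`except` because its outcome may depend on the Python version in use.

```python
from urllib.parse import quote, urlsplit

raw_password = "pa[ss"  # hypothetical password containing an unescaped '['

# Attempt the unescaped form first and record what happens.
try:
    urlsplit(f"//user:{raw_password}@host")
    unescaped_result = "parsed without error"
except ValueError as exc:
    unescaped_result = f"ValueError: {exc}"

# Percent-encode the password; safe="" also escapes '/'.
encoded_password = quote(raw_password, safe="")
parts = urlsplit(f"//user:{encoded_password}@host")

print(unescaped_result)
print(encoded_password, parts.username, parts.hostname)
```

Note that `parts.password` still holds the percent-encoded form; `urlsplit` does not decode it.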
|||
| msg327239 - (view) | Author: Thomas Jollans (tjollans) | Date: 2018-10-06 09:43 | |
RFC 2396 explicitly excludes the use of [ and ] in URLs. RFC 2732 <https://www.ietf.org/rfc/rfc2732.txt> defines the syntax for IPv6 URLs, and allows [ and ] ONLY in the host part. So I'd say that the behaviour is arguably correct (if somewhat unfortunate) |
|||
| msg334273 - (view) | Author: Terrence Brannon (metaperl) | Date: 2019-01-23 21:37 | |
I would like to add to this bug: if the password field of the URL contains a pound sign (#) or question mark (?), the parser splits the URL incorrectly, as this gist demonstrates: https://gist.github.com/metaperl/fc6f43bf6b9a9f874b8f27e29695e68c |
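The following sketch illustrates the kind of misparse described here; it is not the content of the linked gist. The URLs, user name, and password are hypothetical. Because `#` and `?` terminate the network location, everything after them, including `@host`, ends up in the fragment or query.

```python
from urllib.parse import urlsplit

# Hypothetical URLs whose passwords contain an unescaped '#' or '?'.
hash_parts = urlsplit("//user:pa#ss@host")
question_parts = urlsplit("//user:pa?ss@host")

# The netloc is cut at the delimiter, so no '@' remains in it and
# no username or password is recognised.
print(hash_parts.netloc, hash_parts.fragment, hash_parts.username)
print(question_parts.netloc, question_parts.query, question_parts.username)
```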
|||
| msg334302 - (view) | Author: Terrence Brannon (metaperl) | Date: 2019-01-24 15:55 | |
Also, if SQLAlchemy is any guide, note that it unquotes both the username and password of the URL: https://github.com/sqlalchemy/sqlalchemy/blob/master/lib/sqlalchemy/engine/url.py#L274 |
|||
| msg334303 - (view) | Author: Terrence Brannon (metaperl) | Date: 2019-01-24 15:59 | |
Regarding "RFC 2396 explicitly excludes the use of [ and ] in URLs. RFC 2732 <https://www.ietf.org/rfc/rfc2732.txt> defines the syntax for IPv6 URLs, and allows [ and ] ONLY in the host part. So I'd say that the behaviour is arguably correct (if somewhat unfortunate)": I would say that a square bracket CAN be used in the password, but that it should be URL-encoded, and that this library should URL-decode both the username and the password, just as SQLAlchemy does. |
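As a rough illustration of the decoding proposed here, the sketch below wraps `urlsplit` in a hypothetical helper, `decoded_credentials`, which applies `urllib.parse.unquote` to the username and password. The helper name and the example URL are assumptions made for illustration; this is not what urllib does by itself, nor a copy of SQLAlchemy's code.

```python
from urllib.parse import unquote, urlsplit


def decoded_credentials(url):
    """Return (username, password, hostname) with userinfo percent-decoded.

    Hypothetical helper: urlsplit itself leaves username and password
    in their percent-encoded form.
    """
    parts = urlsplit(url)
    username = unquote(parts.username) if parts.username is not None else None
    password = unquote(parts.password) if parts.password is not None else None
    return username, password, parts.hostname


# Hypothetical URL whose password "p[ss#w?rd" was percent-encoded beforehand.
creds = decoded_credentials("postgresql://us%40er:p%5Bss%23w%3Frd@db.example:5432/app")
print(creds)
```

Decoding only after splitting keeps the delimiters unambiguous: the parser sees `%5B`, `%23`, and `%3F` rather than `[`, `#`, and `?`.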
|||
| msg354745 - (view) | Author: STINNER Victor (vstinner) * | Date: 2019-10-15 17:08 | |
I modified my PR 16780, originally written for bpo-36338, to also fix this issue. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:58:59 | admin | set | github: 77523 |
| 2019-10-15 17:08:55 | vstinner | set | messages: + msg354745 |
| 2019-10-15 16:24:08 | xtreak | set | nosy: + vstinner |
| 2019-01-24 15:59:49 | metaperl | set | messages: + msg334303 |
| 2019-01-24 15:55:55 | metaperl | set | messages: + msg334302 |
| 2019-01-23 21:37:03 | metaperl | set | nosy: + metaperl; messages: + msg334273 |
| 2018-10-06 09:43:45 | tjollans | set | nosy: + tjollans; messages: + msg327239 |
| 2018-05-19 13:49:36 | martin.panter | set | nosy: + martin.panter; messages: + msg317119 |
| 2018-04-23 13:44:45 | benaryorg | set | type: behavior |
| 2018-04-23 13:44:30 | benaryorg | create | |
