Issue 22347: mimetypes.guess_type("//example.com") misinterprets host name as file name
Created on 2014-09-06 02:52 by martin.panter, last changed 2022-04-11 14:58 by admin.
| Files | |||
|---|---|---|
| File name | Uploaded | Description |
| mimetypes-host.patch | martin.panter, 2015-02-24 05:53 | |
| Pull Requests | |||
|---|---|---|---|
| URL | Status | Linked | |
| PR 15522 | merged | corona10, 2019-08-26 16:04 | |
| PR 15685 | merged | miss-islington, 2019-09-05 00:34 | |
| PR 15687 | merged | corona10, 2019-09-05 00:49 | |
| PR 16724 | merged | maxking, 2019-10-12 00:42 | |
| PR 16725 | closed | miss-islington, 2019-10-12 05:41 | |
| PR 16727 | merged | maxking, 2019-10-12 15:52 | |
| PR 16728 | merged | maxking, 2019-10-12 16:04 | |
| Messages (15) | |||
|---|---|---|---|
| msg226467 - (view) | Author: Martin Panter (martin.panter) | Date: 2014-09-06 02:52 | |
The documentation says that guess_type() takes a URL, but:
>>> mimetypes.guess_type("http://example.com")
('application/x-msdownload', None)
I suspect the MS download is a reference to *.com files (like DOS's command.com). My current workaround is to strip out the host name from the URL, since I cannot imagine it would be useful for determining the content type. I am also stripping the fragment part. An argument could probably be made for stripping the “;parameters” and “?query” parts as well.
>>> # Workaround for mimetypes.guess_type("//example.com")
... # interpreting host name as file name
... url = urlparse("http://example.com")
>>> url = net.url_replace(url, netloc="", fragment="")
>>> url
'http://'
>>> mimetypes.guess_type(url, strict=False)
(None, None)
|
|||
| msg236479 - (view) | Author: Martin Panter (martin.panter) | Date: 2015-02-24 05:53 | |
Posting a patch to fix this. It passes the URL through a urlsplit() → urlunsplit() stage, while removing the scheme://netloc parts. |
|||
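The urlsplit() → urlunsplit() idea described in the message above can be sketched roughly as follows. This is not the attached mimetypes-host.patch, only a minimal illustration using the standard library. The helper name `guess_type_ignoring_host` is hypothetical, and the sketch assumes that dropping only the scheme and netloc is enough for the caller's purposes (query and fragment parts are left in place, as in the message).

```python
import mimetypes
from urllib.parse import urlsplit, urlunsplit


def guess_type_ignoring_host(url, strict=True):
    """Guess a MIME type without letting the host name act as a file name.

    Hypothetical helper for illustration; not part of the mimetypes module.
    """
    parts = urlsplit(url)
    # Blank out the scheme and host so that a host such as "example.com"
    # can no longer be mistaken for a file with a ".com" extension.
    stripped = urlunsplit(parts._replace(scheme="", netloc=""))
    return mimetypes.guess_type(stripped, strict=strict)


if __name__ == "__main__":
    print(guess_type_ignoring_host("http://example.com"))
    print(guess_type_ignoring_host("http://example.com/index.html"))
```

With this sketch, a URL that has no path component yields `(None, None)`, while the extension of an actual path such as `/index.html` is still inspected.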
| msg335123 - (view) | Author: Dong-hee Na (corona10) | Date: 2019-02-09 02:15 | |
The proposed patch I mentioned on bpo-35939 also solves the above situation.
Python 3.8.0a1+ (heads/bpo-12317:96d37dbcd2, Feb 8 2019, 12:03:40)
[Clang 9.1.0 (clang-902.0.39.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import mimetypes
>>> mimetypes.guess_type("http://example.com")
(None, None)
>>> mimetypes.guess_type("example.com")
('application/x-msdownload', None)
>>>
I've also added the unit tests of mimetypes-host.patch. It works well. I think we can close this issue as well when bpo-35939 is closed. Thanks a lot! |
|||
| msg351156 - (view) | Author: miss-islington (miss-islington) | Date: 2019-09-05 00:34 | |
New changeset 87bd2071c756188b6cd577889fb1682831142ceb by Miss Islington (bot) (Dong-hee Na) in branch 'master': bpo-22347: Update mimetypes.guess_type to allow proper parsing of URLs (GH-15522) https://github.com/python/cpython/commit/87bd2071c756188b6cd577889fb1682831142ceb |
|||
| msg351157 - (view) | Author: miss-islington (miss-islington) | Date: 2019-09-05 00:55 | |
New changeset 6d7a786d2e4b48a6b50614e042ace9ff996f0238 by Miss Islington (bot) in branch '3.8': bpo-22347: Update mimetypes.guess_type to allow proper parsing of URLs (GH-15522) https://github.com/python/cpython/commit/6d7a786d2e4b48a6b50614e042ace9ff996f0238 |
|||
| msg351158 - (view) | Author: miss-islington (miss-islington) | Date: 2019-09-05 01:16 | |
New changeset 8873bff2871078e9f23e6c7d942d3a8edbd0921f by Miss Islington (bot) (Dong-hee Na) in branch '3.7': [3.7] bpo-22347: Update mimetypes.guess_type to allow proper parsing of URLs (GH-15522) (GH-15687) https://github.com/python/cpython/commit/8873bff2871078e9f23e6c7d942d3a8edbd0921f |
|||
| msg351162 - (view) | Author: Dong-hee Na (corona10) | Date: 2019-09-05 01:26 | |
@vstinner (my mentor) @maxking This issue is now solved, and I'd like to close it. Is that okay? |
|||
| msg351164 - (view) | Author: Abhilash Raj (maxking) | Date: 2019-09-05 01:29 | |
I think so, yes. Also, while you are at it, can you also close bpo-35939 with a comment that points to this issue and the right PR for the fix? |
|||
| msg351167 - (view) | Author: Dong-hee Na (corona10) | Date: 2019-09-05 01:34 | |
Great! I will close bpo-35939 also. |
|||
| msg354471 - (view) | Author: Ned Deily (ned.deily) | Date: 2019-10-11 17:18 | |
This change introduces a potential 3.7 regression; see Issue38449. |
|||
| msg354521 - (view) | Author: miss-islington (miss-islington) | Date: 2019-10-12 05:41 | |
New changeset 19a3d873005e5730eeabdc394c961e93f2ec02f0 by Miss Islington (bot) (Abhilash Raj) in branch 'master': bpo-38449: Revert "bpo-22347: Update mimetypes.guess_type to allow oper parsing of URLs (GH-15522)" (GH-16724) https://github.com/python/cpython/commit/19a3d873005e5730eeabdc394c961e93f2ec02f0 |
|||
| msg354535 - (view) | Author: Abhilash Raj (maxking) | Date: 2019-10-12 16:30 | |
I am going to re-open this since the fixes were reverted in all the branches. |
|||
| msg354538 - (view) | Author: Abhilash Raj (maxking) | Date: 2019-10-12 16:58 | |
New changeset 5a638a805503131f4a9cc2bbc5944611295c1500 by Abhilash Raj in branch '3.8': [3.8] bpo-38449: Revert "bpo-22347: Update mimetypes.guess_type to allow oper parsing of URLs" (GH-16724) (GH-16728) https://github.com/python/cpython/commit/5a638a805503131f4a9cc2bbc5944611295c1500 |
|||
| msg354543 - (view) | Author: miss-islington (miss-islington) | Date: 2019-10-12 18:50 | |
New changeset 164bee296ab1f87cc05566b39ee8fb9fb64b3e5a by Miss Islington (bot) (Abhilash Raj) in branch '3.7': [3.7] bpo-38449: Revert "bpo-22347: Update mimetypes.guess_type to allow oper parsing of URLs (GH-15685)" (GH-16724) (GH-16727) https://github.com/python/cpython/commit/164bee296ab1f87cc05566b39ee8fb9fb64b3e5a |
|||
| msg354697 - (view) | Author: Ned Deily (ned.deily) | Date: 2019-10-15 07:30 | |
New changeset 2a405598bbccbc42710dc5ecf3d44c8de4c16582 by Ned Deily (Abhilash Raj) in branch '3.7': [3.7] bpo-38449: Revert "bpo-22347: Update mimetypes.guess_type to allow oper parsing of URLs (GH-15685)" (GH-16724) (GH-16727) https://github.com/python/cpython/commit/2a405598bbccbc42710dc5ecf3d44c8de4c16582 |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:58:07 | admin | set | github: 66543 |
| 2019-10-15 07:30:25 | ned.deily | set | messages: + msg354697 |
| 2019-10-14 12:44:20 | vstinner | set | nosy: - vstinner |
| 2019-10-12 18:50:06 | miss-islington | set | messages: + msg354543 |
| 2019-10-12 16:58:15 | maxking | set | messages: + msg354538 |
| 2019-10-12 16:30:21 | maxking | set | status: closed -> open messages: + msg354535 |
| 2019-10-12 16:04:59 | maxking | set | pull_requests: + pull_request16310 |
| 2019-10-12 15:52:49 | maxking | set | pull_requests: + pull_request16307 |
| 2019-10-12 05:41:53 | miss-islington | set | pull_requests: + pull_request16305 |
| 2019-10-12 05:41:50 | miss-islington | set | messages: + msg354521 |
| 2019-10-12 00:42:58 | maxking | set | pull_requests: + pull_request16303 |
| 2019-10-11 17:18:38 | ned.deily | set | nosy: + ned.deily messages: + msg354471 |
| 2019-09-05 12:19:43 | corona10 | link | issue35939 superseder |
| 2019-09-05 01:44:00 | corona10 | set | status: open -> closed resolution: fixed |
| 2019-09-05 01:34:56 | corona10 | set | stage: patch review -> resolved |
| 2019-09-05 01:34:34 | corona10 | set | messages: + msg351167 |
| 2019-09-05 01:29:06 | maxking | set | messages: + msg351164 |
| 2019-09-05 01:26:52 | corona10 | set | nosy: + vstinner, maxking messages: + msg351162 |
| 2019-09-05 01:16:41 | miss-islington | set | messages: + msg351158 |
| 2019-09-05 00:55:01 | miss-islington | set | messages: + msg351157 |
| 2019-09-05 00:49:06 | corona10 | set | pull_requests: + pull_request15345 |
| 2019-09-05 00:34:48 | miss-islington | set | pull_requests: + pull_request15343 |
| 2019-09-05 00:34:39 | miss-islington | set | nosy: + miss-islington messages: + msg351156 |
| 2019-08-26 16:04:37 | corona10 | set | stage: patch review pull_requests: + pull_request15206 |
| 2019-02-09 02:19:48 | corona10 | set | versions: + Python 3.7, Python 3.8, - Python 3.4 |
| 2019-02-09 02:15:26 | corona10 | set | nosy: + corona10 messages: + msg335123 |
| 2019-02-08 23:25:29 | martin.panter | set | dependencies: + Remove urllib.parse._splittype from mimetypes.guess_type |
| 2015-02-24 05:53:55 | martin.panter | set | files: + mimetypes-host.patch keywords: + patch messages: + msg236479 |
| 2014-09-06 02:52:37 | martin.panter | create | |
