Issue 39783: Optimize construction of Path from other Paths by just returning the same object?
Issue39783
Created on 2020-02-28 11:36 by Antony.Lee, last changed 2022-04-11 14:59 by admin.
| Pull Requests | |||
|---|---|---|---|
| URL | Status | Linked | Edit |
| PR 20288 | open | remi.lapeyre, 2020-05-21 12:00 | |
| Messages (6) | |||
|---|---|---|---|
| msg362874 - (view) | Author: Antony Lee (Antony.Lee) * | Date: 2020-02-28 11:36 | |
Many functions which take a path-like object typically also accept strings (sorry, no hard numbers here). This means that if the function plans to call Path methods on the object, it needs to first call Path() on the arguments to convert them, well, to Paths. This adds an unnecessary cost in the case where the argument is *already* a Path object (which should become more and more common as the use of pathlib spreads), as Path instantiation is not exactly cheap (it's on the order of microseconds).
Instead, given that Paths are immutable, `Path(path)` could just return the exact same path instance, completely bypassing instance creation (after checking that the argument's type exactly matches whatever we need and is not, say, PureWindowsPath when we want to instantiate a PosixPath, etc.). Note that there is prior art for doing so in CPython: creating a frozenset from another frozenset just returns the same instance:
```
In [1]: s = frozenset({1}); id(s) == id(frozenset(s)) == id(s.copy())
Out[1]: True
```
|
|||
| msg362884 - (view) | Author: Raymond Hettinger (rhettinger) * ![]() |
Date: 2020-02-28 16:07 | |
Possibly this could be done just by adding @lru_cache to the __new__() method. |
|||
| msg362886 - (view) | Author: Rémi Lapeyre (remi.lapeyre) * | Date: 2020-02-28 16:23 | |
There is the _closed attribute thought that is mutated by some methods. |
|||
| msg362887 - (view) | Author: Antony Lee (Antony.Lee) * | Date: 2020-02-28 17:43 | |
Decorating __new__ with lru_cache would likely run into memory leakage problems? (one would need a "weak lru_cache", I guess). I didn't know about the _closed attribute. From a quick look it appears to only be settable by using the path (not the actual file) as a context manager, and only serves to block further filesystem methods, but I'm not even sure why? (for example, it doesn't block fspath, so it won't prevent operations via os.path functions anyways...) |
|||
| msg362919 - (view) | Author: Rémi Lapeyre (remi.lapeyre) * | Date: 2020-02-28 21:41 | |
> Decorating __new__ with lru_cache would likely run into memory leakage problems? I think the LRU cache would be for returning the same instance when called with the same string. I don't think it would be needed to return the same instance when called with a Path instance. > From a quick look it appears to only be settable by using the path (not the actual file) as a context manager, and only serves to block further filesystem methods, but I'm not even sure why? This came up in bpo-39682 recently, I'm not sure what this flag is supposed to mean and I could not find it in the documentation. Since it's uncorrelated to the file object and the physical file, I think it may actually be dangerous to rely on this flag. I don't know why is the purpose of using Path as a context manager. |
|||
| msg362940 - (view) | Author: Barney Gale (barneygale) * | Date: 2020-02-28 23:23 | |
Attempted fix shown here: https://github.com/barneygale/cpython/commit/784630ef6ad05031abdefa523e61e0629b15e201 Note that I've already removed context manager support (and hence `_closed`) in this branch. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:59:27 | admin | set | github: 83964 |
| 2020-06-21 09:44:05 | Antony.Lee | set | nosy:
- Antony.Lee |
| 2020-06-21 00:46:22 | remi.lapeyre | set | versions: + Python 3.10, - Python 3.9 |
| 2020-05-21 12:00:05 | remi.lapeyre | set | keywords:
+ patch stage: patch review pull_requests: + pull_request19563 |
| 2020-02-28 23:23:32 | barneygale | set | nosy:
+ barneygale messages: + msg362940 |
| 2020-02-28 21:41:52 | remi.lapeyre | set | messages: + msg362919 |
| 2020-02-28 17:43:18 | Antony.Lee | set | messages: + msg362887 |
| 2020-02-28 16:23:17 | remi.lapeyre | set | nosy:
+ remi.lapeyre messages: + msg362886 |
| 2020-02-28 16:07:14 | rhettinger | set | nosy:
+ rhettinger messages: + msg362884 |
| 2020-02-28 14:18:38 | xtreak | set | nosy:
+ brett.cannon, pitrou |
| 2020-02-28 11:36:37 | Antony.Lee | create | |
