bpo-42972: Fully implement GC protocol for functools LRU cache#26423
Conversation
vstinner
left a comment
There was a problem hiding this comment.
I'm not sure if it's a good idea to "expose" the link type in this traverse function. The only purpose would be to be able to break reference cycles involving multiple "hidden" objects: the LRU cache itself (which is not directly accessible in Python, you need to dig into gc.get_objects()) and the link type (again, you need gc.get_objects() since @methane removed it from the module dict, which is a good thing).
I'm not strongly against it, technically, it respects the GC: the clear function clears the linked list and so removes references to the link type.
By the way, making the two LRU heap types (cache and link) immutable can make these reference cycles even less likely. But it should be done in a separated PR.
Sorry, something went wrong.
|
Oh. lru_list_elem_type was modified to not implemented the GC to optimize the code, I missed that: Please explain it in a comment, it's non-obvious. https://bugs.python.org/issue32422 seems to conflict with https://bugs.python.org/issue42972 Maybe it's acceptable to not implement the GC protocol if the link type is hidden well and it is immutable. In my experience, reference cycles are created in very surprising ways. I don't know about this one. I don't know if it's acceptable for now, if it's a trade-off, or if there is a risk of leaks. An alternative would be to avoid completely Python objects in the cache. Replace lru_cache_object.cache Python dictionary with a _Py_hashtable_t. So the linked list items don't have to be Python objects anymore. But this change would be a major change and should only be done in the main branch. The question is what to do in the 3.10 branch:
|
Sorry, something went wrong.
|
Thanks @erlend-aasland for the PR, and @vstinner for merging it 🌮🎉.. I'm working now to backport this PR to: 3.10. |
Sorry, something went wrong.
…nGH-26423) (cherry picked from commit 3f8d332) Co-authored-by: Erlend Egeberg Aasland <erlend.aasland@innova.no>
|
I dislike the change, but I merged it anyway to unblock https://bugs.python.org/issue42972 since @pablogsal marked it as a release blocker for the beta2. Moreover, at least, the change makes the code less broken, since it now visits the LRU cache type which is a bugfix. We can decide what to do with the link type after the beta2. |
Sorry, something went wrong.
|
@vstinner: I can add explanatory comments in a new PR, if you want. Also, let me know if you want a PR for making the cache and cache link types immutable. |
Sorry, something went wrong.
|
I am a bit confused about what happened here, the PR mentions that "fully implement the GC protocol" but the change just moves some calls and visits the type.... |
Sorry, something went wrong.
The LRU cache type already had the Py_TPFLAGS_HAVE_GC flag, and already had a traverse method. The only thing missing was visiting the type. According to bpo-42972, the LRU cache now fully implements the GC protocol, no? |
Sorry, something went wrong.
) (cherry picked from commit 3f8d332) Co-authored-by: Erlend Egeberg Aasland <erlend.aasland@innova.no>
Is there any regression in performance with this or other related changes for the cache? I am asking basically based on @methane previous - efforts to optimize this. |
Sorry, something went wrong.
@methane's optimisations were in the end applied (lru cache link type is visited via the lru cache type traversal slot). I haven't done any benchmarking yet. |
Sorry, something went wrong.
https://bugs.python.org/issue42972