bpo-42213: Remove redundant cyclic GC hack in sqlite3#26462
bpo-42213: Remove redundant cyclic GC hack in sqlite3#26462erlend-aasland wants to merge 13 commits into
Conversation
|
Pablo/Victor: Whenever you have time 🙏🏻 |
Sorry, something went wrong.
|
I'm not sure if this should be backported to 3.10. It's just a cleanup, so I guess it's fine to backport it, but on the other hand there is no harm in not backporting it. I have no strong preference here. |
Sorry, something went wrong.
|
The Windows failures looks very similar to the GC issues in #24203. https://github.com/python/cpython/pull/26462/checks?check_run_id=2709446243 |
Sorry, something went wrong.
|
Adding 209196eaaa44d02551dad00516ff32fb447c1d8f fixes the test issues on Windows. @vstinner, what do you make of this? |
Sorry, something went wrong.
The sqlite3 module now fully implements the GC protocol, so there's no need for this workaround anymore.
209196e to
70511f3
Compare
June 1, 2021 10:55
|
FYI, rebased onto |
Sorry, something went wrong.
|
Did you run |
Sorry, something went wrong.
|
Windows (x64) CI job failed: |
Sorry, something went wrong.
|
Ah, there is also a warning on Windows (same CI job): |
Sorry, something went wrong.
Yes, I did :) |
Sorry, something went wrong.
Yes, this is the reason for the draft/WIP status. On Windows, the GC uses a lot more time to break the ref. cycle between the statement cache and the connection. From what I can see, it seems that it just loops longer; there's a load of traverse calls before the cycle is broken. On Mac and Linux, the cycle is broken much faster. This results in the objects living slightly longer on Windows, so sqlite3 is clinging on to |
Sorry, something went wrong.
|
In general, I dislike relying on implicit resource management. I would prefer to emit a ResourceWarning if a resource is not released explicitly (by calling a close() method for example). Maybe some tests should call the close() method explicitly? |
Sorry, something went wrong.
That's the workaround I've added in #24203: see d2078bc. It's exactly the same problem: The LRU statement cache ( Previously (or should I say currently), |
Sorry, something went wrong.
|
Keeping Python objects in memory is fine. What is not fine is to hold an OS release, like a file. Do you mean that calling close() does not close the file on disk? |
Sorry, something went wrong.
It does close it, but not until GC is done. |
Sorry, something went wrong.
|
|
Sorry, something went wrong.
The connection object is decref'ed after
My computer time today is very fragmented, so my replies will be short and hopefully not too messy :) |
Sorry, something went wrong.
Well, again, IMO sqlite3 must behave as the io module: emit a ResourceWarning if a connection is not closed explicitly. All tests must explicitly close the connection. Closing a file must not depend if a sqlite object is part of a reference cycle or not. There are too many ways to create reference cycles in Python. |
Sorry, something went wrong.
In that case, I believe explicitly closing the connection and explicitly triggering gc.collect is the best thing to do here. |
Sorry, something went wrong.
|
NOBODY EXPECTS THE SPANISH INQUISITION (I'm Spanish so I can say that :) ) I think there is some confusion between @vstinner and @erlend-aasland. Let me try to clarify what are the expectations so is easier to reconduct the discussion:
One of the things that @vstinner is saying (and I very much agree with) is that we should never depend on calling Furthermore, @vstinner thinks that this implicit resource management (closing on destruction) is so bad that a warning should be emmited if we ever reach that code with the connection open, and all users should explicitly call In short:
|
Sorry, something went wrong.
|
I understand that it is possible that the database file remains open after sqlite3_close_v2() is called, if some other sqlite objects still exist. Oh, this is counter intuitive. How can a programmer know if a database file is closed or not? Try to delete it on Windows? :-) Would it help to emit a warning or even an hard exception if this case happen? |
Sorry, something went wrong.
|
Not the Spanish inquisition!!! 😂 Thanks, @pablogsal! :) I believe we're getting closer to understanding the various issues. Victor:
Yes, it is counter-intuitive, and no, there's no API for detecting if a database file is closed (AFAIK). |
Sorry, something went wrong.
|
I understand that the close() method must destroy/release/close all sqlite objects before calling sqlite3_close_v2(). Said differently, would it be possible to implement a public or private close() method on each Python sqlite object which somehow prevents to close a database (sqlite3_close_v2), to release the resources, without having to destroy the Python object? See my file object: even if the Python object is not destroyed, the inner resource is released. It requires all file methods to raise an exception if the file is closed. |
Sorry, something went wrong.
|
Anyway, I must take a break now; back at the computer in 5/6 hours :) |
Sorry, something went wrong.
The Python project is developed asynchronously. You don't have to reply in less than 1 hour. It's perfectly fine to take 1 week or even 1 month to reply to a review. Your message sends notifications which are counter-productive. I go to the PR to see the comment... to say that you will reply later, ah. On IRC, I only get a notification that you wrote a comment, but I don't get the content. |
Sorry, something went wrong.
- add wrapper for sqlite3_close_v2() - explicitly free pending statements before close()
|
Looks like I got it right. PTAL, @vstinner & @pablogsal. Thanks for your patience and guidance. |
Sorry, something went wrong.
|
The following methods check indirectly if the connection is closed. *_sqlite3.Connection.execute(): call self.cursor().execute()
Other methods call pysqlite_check_connection(). |
Sorry, something went wrong.
vstinner
left a comment
There was a problem hiding this comment.
Hum. Can you try to split your PR in two parts? First PR to rewrite the C code, second PR to rewrite the tests.
Sorry, something went wrong.
Fixed in 7d8014a.
Yes, I could do that, but that would make the CI fail on the first PR. |
Sorry, something went wrong.
...and sort imports
Hum, I am not strictly thinking about C vs Python, but more something like that: First PR:
Second PR:
|
Sorry, something went wrong.
|
This PR is no longer only about "Remove redundant cyclic GC hack in sqlite3" and "The sqlite3 module now fully implements the GC protocol; there's no need for this workaround anymore." It's now way more than that. |
Sorry, something went wrong.
|
True. Can I re-use the same issue number for both PRs? |
Sorry, something went wrong.
Yes. It's a good practice to put related changes in the same bpo. |
Sorry, something went wrong.

The sqlite3 module now fully implements the GC protocol; there's no
need for this workaround anymore.
https://bugs.python.org/issue42213