gh-111262: Add PyDict_Pop() function [with default value] by vstinner · Pull Request #111939 · python/cpython
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please show how it will be used in existing code instead of the old API? It will help to see what design is better.
For now, both uses of _PyDict_Pop_KnownHash() ignore the returning value. It means either that the design is not optimal or that the caller does not fully utilize the API.
Please show how it will be used in existing code instead of the old API? It will help to see what design is better.
Are you suggesting to return PyObject* and not provide the information if the key existed in the key, or if the default value was used?
You asked for this information:
It is difficult to distinguish two cases: return value removed from the dict and returning the increfed default value. If the dict can contain arbitrary values, you need to create a special sentinel value.
Do you want to suggest a different API?
For now, both uses of _PyDict_Pop_KnownHash() ignore the returning value. It means either that the design is not optimal or that the caller does not fully utilize the API.
Well, I wasn't sure if this PR should change the existing code "too much". I just pushed a change to use _PyDict_Pop_KnownHash() return value. As PyDict_GetItemRef(..., &value): you can ignore the return value and always use value to decide how to handle the 3 cases (error, not found, found).
I replaced usage of private _PyDict_Pop() with new public PyDict_Pop() to show how its return value can be useful to check for errors.
Change the API of the internal _PyDict_Pop_KnownHash() function to return an int. Co-Authored-By: Stefan Behnel <stefan_ml@behnel.de>
vstinner
changed the title
gh-106320: Add PyDict_Pop() function
gh-111262: Add PyDict_Pop() function
Oh, this PR is related to issue gh-111262. Before I mentioned the wrong issue number.
This looks OK to me API-wise, but as always, preferably wait for the C-API workgroup to review it. Please don't add it to the limited API before such a review.
Do we need PyDict_PopString as well?
My gut feeling tells me that the most important use case for this is a fire-and-forget "delete an item" kind of operation, where I do not care whether the key is found or not. This is at least a little less efficient than it should be, because users still need to pass an explicit default value (probably Py_None) and then PyDict_Pop() needs to incref it just to let the user decref it afterwards.
Why do we need to raise an exception at all? Why not return the default value regardless, i.e. return NULL if a user did not provide a default?
Options I see:
- return
1with a new reference as value if the key was found,0if the key is not found and the xincref-ed default is returned (and users probably know whether they passedNULLor not), and-1if an exception was raised with the value set toNULL. - do not let the users provide a default at all, return
1with a new reference if the value was found,0withNULLvalue if not found, and-1withNULLvalue for an exception.
I actually like the second option best, it seems quite simple. If users really want an exception to be raised, raising a KeyError with the key in their hands is entirely trivial.
That written, returning three different codes is somewhat annoying since it means that users need to put the return code in a variable first, rather than just writing if (PyDict_Pop() == -1) goto error;.
However, at least with my second option, the returned dict value is actually a proper point of distinction as well, so users could safely use this simple "error on -1" pattern and then check whether the value is NULL or not, or simply xdecref it if they don't care. Allowing both usages (check all three return codes vs. check error code and value) seems quite a user friendly API.
- do not let the users provide a default at all, return
1with a new reference if the value was found,0withNULLvalue if not found, and-1withNULLvalue for an exception.
+1 from me as well. Using a default value with dict.pop is, in my experience, only useful to avoid a KeyError.
Thank you Victor for showing how it can be used in the code (at least in our code). We now can see advantages and drawbacks of different designs.
I was going to write the same as @scoder. He just reads my mind. Most of uses cases here are trivial "delete without raising KeyError". The default value does not matter in this case. KeyError can be raised in only two cases where the default value can be NULL: in the implementations of the pop() method in dict and OrderedDict.
I propose to implement (perhaps in a separate PR) the variant of PyDict_Pop() without the default_value parameter. It should set value to NULL and return 0 without raising KeyError if no item found. Then compare two variants. Is there a significant difference in complexity or the number of lines of code?
Ok, and now something completely different: PR gh-112028
- Remove default_value parameter
- result argument can be NULL: it's common that the removed value is not used in the caller, see the PR
- Don't raise KeyError if the key is not present
Do we need PyDict_PopString as well?
If needed, it can be added later, but in Python code base, it doesn't seem to be needed. So far, nobody asked for such API.
but as always, preferably wait for the C-API workgroup to review it. Please don't add it to the limited API before such a review.
The Steering Council has been asked to take a decision on PEP 731 -- C API Working Group Charter 3 weeks ago, and so far, no decision was taken. I don't think that the lack of such working group should hold limited C API change. Can't we take decisions as a community until such group exists?
vstinner
changed the title
gh-111262: Add PyDict_Pop() function
gh-111262: Add PyDict_Pop() function [with default value]
Do we need
PyDict_PopStringas well?
It can be used at least in two places in the CPython codebase. Even if it was not used, it is worth to add it, because all similar functions have a *String pair. String keys are pretty common, and you can expect that C strings are used more in third-party code if they prefer simplicity over performance.
It can be used at least in two places in the CPython codebase.
Would you mind to point me where it could be used?
Can't we take decisions as a community until such group exists?
I don't think we should. This PR and others like it are setting precedents for new future C-API, and we do not agree on the details of the direction we want to take. We should be extra careful that the PRs adhere the guidelines we want to set, and we'll want to improve the guidelines based on the individual PRs.
Can't we take decisions as a community until such group exists?
Theoretically yes, but since we know that the group is coming, it's a bit rude to try and race ahead of it just in case it disagrees with your design.
As I proposed on the other PR, any new API going in now shouldn't be treated as precedent for guidelines going forward. It may end up as a wart, but that's the risk you take adding new things in this period. Help get the WG established quickly and things will start to be unblocked (and definitely don't add anything to the stable ABI in the meantime).
(and definitely don't add anything to the stable ABI in the meantime).
Ok.
By the way, 26 functions were added to Python 3.13 stable ABI. Should they be removed until the C API Working Group is created and decides what to do with them?
Theoretically yes, but since we know that the group is coming, it's a bit rude to try and race ahead of it just in case it disagrees with your design.
This PR is related to the removal of private functions in Python 3.13. Apparently, users are now eagger to test Python 3.13 as soon as alpha1, reported issues, and so I created #111481 and #112026 to try to mitigate issues.
Are you suggesting to freeze the C API for now?
By the way, 26 functions were added to Python 3.13 stable ABI.
These were all existing functions though, right? I'm not aware of anything brand new being added - we've all been saying to hold off on brand new APIs being marked long-term stable for months now.
You still get to use your judgement. If you judge that there's only one reasonable way to implement an API, then go ahead and take the risk. Just be aware that we are actively trying to plan how to plan the C API into the future, and so unilateral design decisions risk making a mockery of that plan, and by extension, the entire project. You're aware of the problems and proposed solutions that have been raised (see the issue repositories at https://github.com/capi-workgroup/ if not), and so anything falling in those areas is likely a good candidate for bringing to a discussion before committing us to support the design for the next 10+ years.
I close this PR: #112028 was merged with a different API.