gh-111140: Adds PyLong_AsNativeBytes and PyLong_FromNative[Unsigned]Bytes functions#114886
gh-111140: Adds PyLong_AsNativeBytes and PyLong_FromNative[Unsigned]Bytes functions#114886zooba merged 22 commits into
Conversation
|
Sharing work so far for feedback/concerns. I like the unsigned/signed semantics (basically, using "unsigned" suppresses an error where you need one more bit to be a sign bit), and I dislike the mess needed to support byteswapping for non-native endian on arbitrary length buffers. But it's written now and seems to work, so I guess it can stay. The "FromByteArray" prototypes are a lie right now, but they're coming in next :) Probably need a "WithOptions" variant as well, though I have a feeling the implementation can't be any better than |
Sorry, something went wrong.
|
The only unhappy buildbot seems to be the s390x Fedora one, and that looks unrelated to this (if int conversions in posixmodule were incorrect, I'd expect to see other buildbots fail as well). Any further feedback? |
Sorry, something went wrong.
|
Thanks @erlend-aasland! Great feedback, and even found a bug (this is why we get reviews!) I left one open question about doc wording, as I know you're more broadly involved in that than I am. |
Sorry, something went wrong.
|
How should I use |
Sorry, something went wrong.
What you do in the overflow case is up to you. Maybe you can overallocate? Maybe you just have to fail. But you don't have to overallocate to fit precisely into a signed variable. For an unsigned variable, you may require one extra bit to store the sign bit, e.g. There really isn't a consistent way for us to handle this, lacking knowledge of what the value actually is. The current (private) function just fails on all negative values if you ask for unsigned, but that doesn't actually touch this issue at all - it just excludes a whole lot of otherwise valid conversions (e.g. So the suggestion there is if you know the magnitude of the value fits into your destination, or you don't care, then ignore positive returns from the function. But again, it only affects positive integers that require precisely the number of bits you've provided when treated as unsigned, and so treating as signed requires one extra. |
Sorry, something went wrong.
encukou
left a comment
There was a problem hiding this comment.
Thanks.
IMO it's hard to implement a conversion to native unsigned integer on top of this, but, that could be solved by adding PyLong_AsNativeUnsignedBytes. No need to block this PR.
I trust your judgment on int/*size_t as well; no need for that to block the PR.
Sorry, something went wrong.
I briefly implemented that. It took the size of the provided buffer, allocated a new one with one extra byte, did exactly the same thing, and copied the result into the original buffer or failed if it wasn't going to fit. It seems wasteful, especially compared to the caller doing a static allocation of a larger buffer themselves and calling this API, but we can't really improve on it that much - calculating that the resulting number is going to be in that one-bit worth of grey area isn't obvious. (And IMHO not offering it makes it more likely that people will implement it efficiently for themselves, rather than complaining that we did it inefficiently.) The thing that isn't trivial is to reject negative values entirely, which someone may want to do. But that's more likely to be an application requirement than an integration/adapter requirement, so I'd expect they'll have a comparison to zero in their own language before they convert anyway, which means they'll go to a native signed integer rather than directly to unsigned, or they'll do the comparison using a Python comparison. |
Sorry, something went wrong.
@encukou It didn't block, I changed them all to |
Sorry, something went wrong.
|
Looking this API over the main thing I don't like is... unrelated to the API. It appears to be doing the right thing twos compliment wise given that it is returning data in whole bytes with the sign extension filling an entire additional byte when necessary as \xff or \x00. I do find the need for an additional byte if someone is dealing exclusively with a value they already know to be non-negative yet large enough to occupy the high bit "annoying"... but this API isn't primarily for use by people wanting to merely fill a native type like that. They can add a range check of their own and be happy. ie: The prior understanding of the underlying value as discussed above. Thus my PR making the example not use it to fill in a mere Granted we could change to encouraging this API everywhere in the future instead of people forgetting to |
Sorry, something went wrong.
…gned]Bytes functions (pythonGH-114886)
edited by bedevere-app
Bot
LoadingUh oh!
There was an error while loading. Please reload this page.
Copy link Copy MarkdownSorry, something went wrong.
Uh oh!
There was an error while loading. Please reload this page.