◐ Shell
reader mode source ↗
Skip to content

bpo-18835: Add PyMem_AlignedAlloc()#4089

Closed
vstinner wants to merge 1 commit into
python:masterfrom
vstinner:memalign
Closed

bpo-18835: Add PyMem_AlignedAlloc()#4089
vstinner wants to merge 1 commit into
python:masterfrom
vstinner:memalign

Conversation

@vstinner

@vstinner vstinner commented Oct 23, 2017

Copy link
Copy Markdown
Member
  • Add aligned_alloc and aligned_free fields to PyMemAllocatorEx

  • Rename PyMemAllocatorEx structure to PyMemAllocatorEx2 to make sure
    that C extensions are upgraded to fill the new aligned_alloc and
    aligned_free fields

  • Add 6 new functions:

    • PyMem_RawAlignedAlloc()
    • PyMem_RawAlignedFree()
    • PyMem_AlignedAlloc()
    • PyMem_AlignedFree()
    • PyObject_AlignedAlloc()
    • PyObject_AlignedFree()

https://bugs.python.org/issue18835

@vstinner vstinner changed the title bpo-18835: Add PyMem_AlignedAlloc() Oct 23, 2017
@vstinner

Copy link
Copy Markdown
Member Author

AppVeyor crashed with a heap corruption (exception c0000374) on:

test__testcapi (test.test_capi.Test_testcapi) ... Windows fatal exception: code 0xc0000374

@vstinner

Copy link
Copy Markdown
Member Author

TODO: _PyMem_DebugRawAlignedAlloc() is currently a copy of _PyMem_DebugRawAlloc(). Instead, it should trap such bug: PyMem_Free() called on a memory blocked allocated by PyMem_AlignedAlloc(). Maybe it should store the alignment as well, to dump it in a fatal error message?

@vstinner

Copy link
Copy Markdown
Member Author

AppVeyor crashed with a heap corruption (...)

I rebuild Python from scratch in Debug|x64: I am unable to reproduce the bug. Strange.

@vstinner

Copy link
Copy Markdown
Member Author

TODO: _PyMem_DebugRawAlignedAlloc() is currently a copy of _PyMem_DebugRawAlloc(). Instead, it should trap such bug: PyMem_Free() called on a memory blocked allocated by PyMem_AlignedAlloc(). Maybe it should store the alignment as well, to dump it in a fatal error message?

Done. PyMem_Free(PyMem_AlignedAlloc()) now fails as expected. I simply chose to use uppercase for the API identifier. I added an unit test to test exactly this.

I also documented newly added functions. I'm not sure of my documentation on the alignment:

+   Depending on the platform, not all values of *alignment* are accepted. Some
+   platforms require *alignment* to be a power of 2, others require *alignment*
+   to be a multiple of ``sizeof(void*)``.

Windows requires alignment to be "an integer power of 2". Maybe we should suggest to always use powers of 2 to best portability? I would be surprised if PyMem_AlignedAlloc() fails with an alignment equals to a power of 2.

@vstinner vstinner changed the title [WIP] bpo-18835: Add PyMem_AlignedAlloc() Oct 27, 2017
@vstinner

Copy link
Copy Markdown
Member Author

According to the discussion on https://bugs.python.org/issue18835 it seems like the feature (aligned memory allocations) is needed, so I finished my implementation:

  • new functions are now documented
  • new functions are now tested for corner cases
  • hooks on new functions are tested
  • debug hooks detect API misuse: PyMem_Free(PyMem_AlignedAlloc()), test required for Windows -- PyMem_Free(PyMem_AlignedAlloc()) does crash immediately on Windows! The debug hook helps to prevent such "portability" bug.
  • I simplified the documentation to: "alignment should be a power of 2 for the best portability."

The PR is now ready for a review.

@vstinner

Copy link
Copy Markdown
Member Author

I asked a colleague who works on the glibc: posix_memalign() cannot return NULL on success in the glibc.
https://sourceware.org/git/?p=glibc.git;a=blob;f=malloc/malloc.c;h=f94d51cca1b35952b15926883bb0256229ffb1fe;hb=HEAD#l5327

Extract:

5341   void *address = RETURN_ADDRESS (0);
5342   mem = _mid_memalign (alignment, size, address);
5343 
5344   if (mem != NULL)
5345     {
5346       *memptr = mem;
5347       return 0;
5348     }
5349 
5350   return ENOMEM;

By the way, posix_memalign() rejects alignment == 0 and alignement which is not a power of 2:

5333   /* Test whether the SIZE argument is valid.  It must be a power of
5334      two multiple of sizeof (void *).  */
5335   if (alignment % sizeof (void *) != 0
5336       || !powerof2 (alignment / sizeof (void *))
5337       || alignment == 0)
5338     return EINVAL;

Oh. PyObject_AlignedAlloc() doesn't reject allignement == 0. Should we reject it as well? (Return NULL in that case.)

@vstinner

Copy link
Copy Markdown
Member Author

Oh wow, on Windows, _aligned_malloc() "simply" crash with abort() if called with alignment=0 or alignment=3 (not a power of 2)!

@vstinner

Copy link
Copy Markdown
Member Author

I added a new commit to check alignment. New documentation:

+   *alignment* must be a power of 2 and greater than 0. If *alignment* is equal
+   to zero or is not a a power of 2, the function fails with an assertion error
+   in debug mode, or returns *NULL* in release mode.

@skrah

skrah commented Oct 27, 2017 via email

Copy link
Copy Markdown
Contributor

@vstinner

Copy link
Copy Markdown
Member Author

alignment % sizeof (void *) != 0

Oh. I checked and posix_memalign(2, 100) fails with EINVAL even if alignment (=2) is a power of 2.

I modified my PR to also check that in check_alignment().

I propose to always check the alignment in Python even if the glibc posix_memalign() fails nicely with EINVAL, to get the same behaviour on all platforms. For example, with my current PR, PyMem_RawAlignedAlloc(0, 100) returns NULL in release mode, rather than crashing (on abort()), on Windows.

@vstinner

Copy link
Copy Markdown
Member Author

@skrah: Does the PR look better with the new alignment checks?

@vstinner

Copy link
Copy Markdown
Member Author

@pitrou: Would you mind to review this big PR please?

@skrah

skrah commented Oct 27, 2017

Copy link
Copy Markdown
Contributor

I wish the call chains were easier to understand. This one ends up calling the
non-aligned version:

/* Execute this (--with-pydebug) */
char *foo = PyMem_AlignedAlloc(8, 10000);

/* Call chain */
PyMem_AlignedAlloc (alignment=alignment@entry=8, size=size@entry=10000) at Objects/obmalloc.c:585
_PyMem_DebugAlignedAlloc (ctx=0x898b40 <_PyMem_Debug+64>, alignment=8, size=10000) at Objects/obmalloc.c:1765
_PyMem_DebugRawAlignedAlloc (ctx=ctx@entry=0x898b40 <_PyMem_Debug+64>, alignment=alignment@entry=8, nbytes=nbytes@entry=10000)
    at Objects/obmalloc.c:1613
_PyObject_AlignedAlloc (ctx=0x0, alignment=8, size=10032) at Objects/obmalloc.c:1438
_PyObject_Alloc (use_calloc=use_calloc@entry=0, ctx=ctx@entry=0x0, nelem=nelem@entry=1, elsize=elsize@entry=10032) at Objects/obmalloc.c:906

   ==> Objects/obmalloc.c:1112    result = PyMem_RawMalloc(nbytes);

@vstinner

Copy link
Copy Markdown
Member Author
_PyObject_AlignedAlloc(void *ctx, size_t alignment, size_t size)
{
    if (alignment <= ALIGNMENT) {

@skrah: "alignment <= ALIGNMENT && size <= SMALL_REQUEST_THRESHOLD" should fix the call chain problem.

Wow, my code was completely wrong :-)

Before fixing this bug, I wrote an unit test to make sure that PyMem_AlignedAlloc() respects the alignment constraint...

Ooops again, _PyMem_DebugRawAlignedAlloc() was wrong too: it didn't align correctly the pointer... I fixed that too.

Sometimes, it helps to write unit tests ;-)

@vstinner

Copy link
Copy Markdown
Member Author

Oh ok, I understood the failure on Windows. _PyObject_AlignedAlloc() still fell back on the raw allocator rather than the aligned allocator, when pymalloc failed to allocated memory.

I had to rewrite the pymalloc allocator to give control to the caller how to fall back on another allocator.

@vstinner

Copy link
Copy Markdown
Member Author

This PR became too big, so I extracted two changes:

Once these PR will be merged, I will rebase this PR on top of master.

@vstinner

vstinner commented Nov 1, 2017

Copy link
Copy Markdown
Member Author

Oh, the Travis CI failed again with https://bugs.python.org/issue31910 I restarted the job.

* Add aligned_alloc and aligned_free fields to PyMemAllocatorEx
* Rename PyMemAllocatorEx structure to PyMemAllocatorEx2 to make sure
  that C extensions are upgraded to fill the new aligned_alloc and
  aligned_free fields
* Add 6 new functions:

  * PyMem_RawAlignedAlloc()
  * PyMem_RawAlignedFree()
  * PyMem_AlignedAlloc()
  * PyMem_AlignedFree()
  * PyObject_AlignedAlloc()
  * PyObject_AlignedFree()
@vstinner

vstinner commented Nov 1, 2017

Copy link
Copy Markdown
Member Author

I rebased my PR on top of master to get the fix for https://bugs.python.org/issue31910

@vstinner vstinner closed this Nov 24, 2017
@vstinner

Copy link
Copy Markdown
Member Author

https://bugs.python.org/issue18835 has been rejected, so I closed this PR.

@vstinner vstinner deleted the memalign branch November 24, 2017 01:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants