Here's a new attempt; please let me know if this works out better.
Changes:
- Switched to CRT string functions (wcsncmp, wcscpy) instead of Windows lstrxxxW. There was no lstrncmpW.
- Switched to PyMem_Raw(Malloc|Free) and added an explicit memset after allocation
- Improved error handling (check arguments for NULL, check the result of memory allocation)
- Fixed a possible overrun when checking whether src_path starts with "\??\"
- Added extensive comments describing the buffer sizing
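
For reference, the prefix check and the allocation pattern described above might look roughly like the sketch below. This is not the code from the patch: the helper names has_nt_prefix and dup_wide_zeroed are hypothetical, and plain malloc/free stand in for PyMem_RawMalloc/PyMem_RawFree so that the snippet compiles without Python.h.

```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: returns 1 if path starts with the NT prefix "\??\".
 * wcsncmp stops at the first terminating null in either string, so a path
 * shorter than four characters is never read past its end. */
static int has_nt_prefix(const wchar_t *path)
{
    if (path == NULL)
        return 0;
    return wcsncmp(path, L"\\??\\", 4) == 0;
}

/* Hypothetical helper: duplicates src into a freshly allocated, zeroed
 * buffer. Returns NULL on a NULL argument or on allocation failure.
 * malloc is a stand-in for PyMem_RawMalloc (assumption for this sketch);
 * the caller releases the result with free. */
static wchar_t *dup_wide_zeroed(const wchar_t *src)
{
    size_t len, nbytes;
    wchar_t *buf;

    if (src == NULL)
        return NULL;

    len = wcslen(src);
    /* Buffer size in bytes: len characters plus the terminating null. */
    nbytes = (len + 1) * sizeof(wchar_t);

    buf = malloc(nbytes);
    if (buf == NULL)
        return NULL;

    /* Explicit zero fill so no uninitialized bytes remain in the buffer. */
    memset(buf, 0, nbytes);
    wcscpy(buf, src);
    return buf;
}
```

The size is computed in bytes from the character count plus one for the terminator, which is the kind of calculation the new comments are meant to spell out.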
I already have ideas for further improvements, but I think we should get this in place first.