GH-80789: Bundle ``ensurepip`` wheels at build time by AA-Turner · Pull Request #109130 · python/cpython
Conversation
0aaf135 was an attempt to integrate this into the Makefile/PCBuild more 'properly', but I'm not confident in making wider-scale changes here (said commit didn't work, so I've reverted it).
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is going to need something for buildbots to avoid breaking them all, but I like the direction here.
| pythoninfo_script: | ||
| - cd build | ||
| - make pythoninfo | ||
| bundle_ensurepip_script: |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| bundle_ensurepip_script: | |
| # Download wheels for the venv step | |
| bundle_ensurepip_script: |
Is this the reason for adding it here?
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anything using venv or ensurepip during testing needs this provisioning step. If there's no uses, it'd be unnecessary.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The actual reason to include it is that Cirrus CI failed without this and passed with it. I don't know why Free BSD requires the venv step though, as all other tests pass normally -- only documentation and hypothesis (which use venvs) seem to have an explicit venv step.
A
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is going to need something for buildbots to avoid breaking them all, but I like the direction here.
I identified the place needing to be changed in #12791 (comment).
@AA-Turner FYI
I identified the place needing to be changed
I tried to explore a different approach of fully integrating this into make all in 0aaf135, which I think may be a better approach as it doesn't require downsteam distributors to change anything in their processes. I defer to those with much more experience with the build systems, though.
A
I tried to explore a different approach of fully integrating this into
make all
I tried doing that in my PR and was getting weird behavior, I wasn't sure why. Maybe I was misusing something in the build machinery or was just confused...
If that works, it's probably fine. OTOH, you may want to take into account that many distributions prefer their build envs to be disconnected from the internet so it might still be wise to provide them with means to pre-provision the wheel separately.
This PR doesn't attempt to switch to the one-project model, as I found the diff was too large to reasonably review.
Would #109245 help?
If that works, it's probably fine. OTOH, you may want to take into account that many distributions prefer their build envs to be disconnected from the internet so it might still be wise to provide them with means to pre-provision the wheel separately.
In Gentoo we aren't actually using the wheels originally bundled with CPython itself but providing the newest versions of the them separately, so as long as that continues working, I suppose that's fine with us. However, it is is paramount that:
- The build process and test suite fully respect
--with-wheel-pkg-dirand do not attempt to download anything. - This is entirely restricted to CPython build-time, and in particularly creating venv doesn't attempt any fetching.
However, I can imagine it could be mildly annoying to users who aren't used to the new workflow that having a complete git clone would no longer suffice to actually build CPython offline.
Same for Fedora. With an addition that we do sometimes keep the bundled wheels for too new or too old Pythons, so having a way to pre-populate the wheels instead of downloaidng them would still be needed.
@AA-Turner could you resolve the conflicts with this PR?
I think it would be good to have this be in Python 3.13's cycle as early as possible, to allow any potential redistributors who care about the details of ensurepip to have time to adjust things as well as provide us feedback on this as early as feasible.
| spec = importlib.util.spec_from_file_location("ensurepip", ENSURE_PIP_INIT) | ||
| ensurepip = importlib.util.module_from_spec(spec) | ||
| spec.loader.exec_module(ensurepip) | ||
| return ensurepip._PROJECTS |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AA-Turner I believe this would need to be updated after #109245, and it'll probably let you work with simpler structures.
| whl = response.read() | ||
| except URLError as exc: | ||
| print_error(f"Failed to download {wheel_url!r}: {exc}") | ||
| errors = 1 |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was this meant to be a counter?
| continue | ||
| else: | ||
| print_error(f"An invalid '{name}' wheel exists.") | ||
| os.remove(wheel_path) |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason not to use pathlib's method? Seeing that it's already used everywhere..
| from urllib.error import URLError | ||
| from urllib.request import urlopen | ||
|
|
||
| HOST = 'https://files.pythonhosted.org' |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we name this something like a PYPI_CDN_URL for maintainability?
|
|
||
| def print_notice(message: str) -> None: | ||
| if GITHUB_ACTIONS: | ||
| print(f"::notice::{message}", end="\n\n") |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: this also works if output to stderr FYI.
|
|
||
| try: | ||
| projects = _get_projects() | ||
| except (AttributeError, TypeError): |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there's an abstraction leak here: it's hard to guess from looking at the _get_projects() function, what in it might trigger these exceptions. I'd recommend processing them inside that function and making obvious such places where the exceptions may occur. Instead, I'd convert both of the exceptions into something like an ImportError and handle just that. This would contribute to transparency of how that function works.
| return 1 | ||
|
|
||
| errors = 0 | ||
| for name, version, checksum in projects: |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW with #109245, looping will stop making sense here. So maybe you could start working with just projects[0] already?
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, let's keep them separate.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was kinda assuming that the simplification PR would get merged first, in which case, this one would have to adapt..
|
|
||
| * Wheels for :mod:`ensurepip` are no longer bundled in the CPython source | ||
| tree. Distributors should bundle these as part of the build process by | ||
| running :file:`Tools/build/bundle_ensurepip_wheels.py`. |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would probably be clearer:
| running :file:`Tools/build/bundle_ensurepip_wheels.py`. | |
| running :file:`Tools/build/bundle_ensurepip_wheels.py` with no arguments. |
| # ensure the pip wheel exists | ||
| pip_filename = os.path.join(test.support.STDLIB_DIR, 'ensurepip', '_bundled', | ||
| f'pip-{ensurepip._PIP_VERSION}-py3-none-any.whl') | ||
| if not os.path.isfile(pip_filename): |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like if it didn't exist, there should be some code to clean it up after testing so that the dummy file doesn't get forgotten on disk. Tests should avoid side effects...
| import bundle_ensurepip_wheels as bew | ||
|
|
||
|
|
||
| # Disable fancy GitHub actions output during the tests |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There should be a test for enabled GHA, just for those helpers, then. It'd be sad not to get coverage there.
|
|
||
|
|
||
| def _is_valid_wheel(content: bytes, *, checksum: str) -> bool: | ||
| return checksum == sha256(content, usedforsecurity=False).hexdigest() |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Closing this per #80789 (comment). Thanks @AA-Turner for filing this, even though we're not going ahead with this (despite me implying that earlier in the discussion).