`.pyc` files are larger than they need to be

Python 3.11 made .pyc files almost twice as large as they were in 3.10. There are two main reasons for this:

  • PEP 659 made the bytecode stream ~3x as large as in 3.10.
  • PEP 657 made the location tables ~9x as large as in 3.10.

(Note that these effects compound each other, since longer bytecode means more location entries.)

However, there is low-hanging fruit for improving this situation in 3.12:

  • Bytecode can be compressed using a fairly simple scheme (one byte for instructions without an oparg, two bytes for instructions with an oparg, and zero bytes for CACHE entries). This results in serialized bytecode that is ~66% smaller than 3.11.
  • The location table format already has a mechanism for compressing multiple code units into a single entry. Currently it's only used for the EXTENDED_ARGs and CACHEs that belong to a single instruction, but with slight changes the compiler can use the same mechanism to share location table entries between adjacent instructions. This is a double win: it not only makes .pyc files smaller, but also shrinks the memory footprint of all code objects in the process. Experiments show that this makes location tables ~33% smaller than in 3.11.
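
To make the first idea concrete, here is a rough sketch of the bytecode scheme described above (one byte without an oparg, two bytes with one, zero bytes for CACHE entries). It deliberately uses a toy instruction set: the opcode numbers, the `HAVE_ARGUMENT` threshold, and the `CACHE_ENTRIES` table are hypothetical stand-ins, not CPython's real values, and it assumes that instructions without an oparg carry a zero oparg byte in the uncompressed stream. It is not the code from any actual patch.

```python
# Hypothetical toy instruction set (NOT CPython's real opcode numbering).
CACHE = 0            # placeholder code unit reserving inline cache space
POP_TOP = 1          # takes no oparg
HAVE_ARGUMENT = 90   # assumed threshold: opcodes >= this use their oparg
LOAD_ATTR = 106      # takes an oparg, assumed to be followed by 4 CACHE units
LOAD_FAST = 124      # takes an oparg, no CACHE units

# Assumed number of CACHE code units that follow each opcode.
CACHE_ENTRIES = {LOAD_ATTR: 4}


def compress(code: bytes) -> bytes:
    """Serialize 2-byte code units as 1 or 2 bytes each, dropping CACHE units."""
    out = bytearray()
    i = 0
    while i < len(code):
        op, arg = code[i], code[i + 1]
        i += 2
        out.append(op)
        if op >= HAVE_ARGUMENT:
            out.append(arg)
        # CACHE units are implied by the opcode, so skip them entirely.
        i += 2 * CACHE_ENTRIES.get(op, 0)
    return bytes(out)


def decompress(data: bytes) -> bytes:
    """Rebuild the fixed-width stream, re-inserting zeroed CACHE units."""
    out = bytearray()
    i = 0
    while i < len(data):
        op = data[i]
        i += 1
        arg = 0
        if op >= HAVE_ARGUMENT:
            arg = data[i]
            i += 1
        out += bytes((op, arg))
        out += bytes((CACHE, 0)) * CACHE_ENTRIES.get(op, 0)
    return bytes(out)


original = bytes([LOAD_FAST, 0, LOAD_ATTR, 1] + [CACHE, 0] * 4 + [POP_TOP, 0])
compressed = compress(original)
print(f"{len(original)} -> {len(compressed)} bytes")
```

In this toy stream the 14-byte sequence shrinks to 5 bytes and `decompress(compressed)` reproduces the original exactly. Real savings depend on how many instructions carry inline caches.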

When both of these optimizations are applied, .pyc files become ~33% smaller than in 3.11. That leaves them only ~33% larger than in 3.10, despite all of the rich new debugging information present.
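
As a quick sanity check on how those two percentages fit together, the arithmetic can be sketched as follows. It assumes "almost twice as large" is rounded up to exactly 2x and "~33%" is read as one third; both are simplifications of the approximate figures above.

```python
from fractions import Fraction

size_310 = Fraction(1)                          # 3.10 as the baseline
size_311 = 2 * size_310                         # assumption: exactly twice as large
size_312 = size_311 * (1 - Fraction(1, 3))      # assumption: one third smaller than 3.11

increase_over_310 = size_312 / size_310 - 1
print(f"relative to 3.10: {float(increase_over_310):.0%} larger")
```

Under these assumptions the result is exactly one third larger than 3.10, which matches the figure quoted above.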