
.pyc files are larger than they need to be #99554

@brandtbucher

Member

Python 3.11 made .pyc files almost twice as large. There are two main reasons for this:

  • PEP 659 made the bytecode stream ~3x as large as in 3.10.
  • PEP 657 made the location tables ~9x as large as in 3.10.

(Note that these effects compound each other, since longer bytecode means more location entries.)
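One rough way to see where the bytes go is to total the bytecode and location table sizes for a code object and everything nested inside it. The sketch below assumes CPython 3.10 or newer for `co_linetable` (falling back to `co_lnotab` otherwise); `code_sizes` is a hypothetical helper name, not something in the stdlib.

```python
import marshal
import types


def code_sizes(code):
    """Return (bytecode_bytes, location_table_bytes), summed over a code
    object and every code object nested in its constants."""
    bytecode = len(code.co_code)
    # co_linetable exists on 3.10+; older versions only have co_lnotab.
    if hasattr(code, "co_linetable"):
        table = len(code.co_linetable)
    else:
        table = len(code.co_lnotab)
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            nested_bytecode, nested_table = code_sizes(const)
            bytecode += nested_bytecode
            table += nested_table
    return bytecode, table


source = "def f(x):\n    return x + 1\n"
module_code = compile(source, "<demo>", "exec")
bytecode_size, table_size = code_sizes(module_code)
# marshal.dumps() produces the payload that follows the .pyc header.
serialized_size = len(marshal.dumps(module_code))
print(bytecode_size, table_size, serialized_size)
```

Running the same snippet under 3.10 and 3.11 should give a feel for the growth on a given piece of source; the exact numbers depend on the interpreter version.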

However, there is low-hanging fruit for improving this situation in 3.12:

  • Bytecode can be compressed using a fairly simple scheme (one byte for instructions without an oparg, two bytes for instructions with an oparg, and zero bytes for CACHE entries). This results in serialized bytecode that is ~66% smaller than 3.11.
  • The location table format already has a mechanism for compressing multiple code units into a single entry. Currently it's only used for EXTENDED_ARGs and CACHEs corresponding to a single instruction, but with slight changes the compiler can use the same mechanism to share location table entries between adjacent instructions. This is a double-win, since it not only makes .pyc files smaller, but also shrinks the memory footprint of all code objects in the process. Experiments show that this makes location tables ~33% smaller than 3.11.
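The bytecode scheme from the first bullet can be sketched as a round trip over a toy instruction stream. This is only an illustration, not the actual patch: the opcode numbers, `has_arg`, and `CACHE_COUNTS` are made up for the example, and it assumes that serialized CACHE entries are all zeros and that instructions without an oparg carry a zero argument byte.

```python
CACHE = 0  # hypothetical opcode number for CACHE entries
# Hypothetical table: how many CACHE entries follow each opcode.
CACHE_COUNTS = {122: 1}


def has_arg(op):
    # Hypothetical rule: opcodes >= 90 take an oparg.
    return op >= 90


def compress(raw):
    """Pack 2-byte code units into 1, 2, or 0 bytes each."""
    out = bytearray()
    for i in range(0, len(raw), 2):
        op, arg = raw[i], raw[i + 1]
        if op == CACHE:
            continue  # zero bytes: caches are rebuilt on load
        out.append(op)
        if has_arg(op):
            out.append(arg)  # two bytes for instructions with an oparg
    return bytes(out)


def decompress(packed):
    """Rebuild the full stream, re-inserting zeroed CACHE entries."""
    out = bytearray()
    i = 0
    while i < len(packed):
        op = packed[i]
        i += 1
        arg = 0
        if has_arg(op):
            arg = packed[i]
            i += 1
        out += bytes((op, arg))
        out += bytes((CACHE, 0)) * CACHE_COUNTS.get(op, 0)
    return bytes(out)


# Two loads, a binary op with one cache entry, then an argless instruction.
raw = bytes([100, 0, 100, 1, 122, 0, CACHE, 0, 1, 0])
packed = compress(raw)
print(len(raw), len(packed))
```

Decompression only works because the number of CACHE entries per opcode is known statically, which is why the sketch needs a table like `CACHE_COUNTS`.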

When both of these optimizations are applied, .pyc files become ~33% smaller than 3.11. This is only ~33% larger than 3.10, despite all of the rich new debugging information present.
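As a quick sanity check, the two figures are consistent with each other if "almost twice as large" is rounded to exactly 2x (an assumption made only for this arithmetic):

```python
# Sizes relative to a 3.10 .pyc file (3.10 == 1.0).
size_311 = 2.0  # assumed stand-in for "almost twice as large"
size_312 = size_311 * (1 - 0.33)  # ~33% smaller than 3.11
print(f"{size_312:.2f}")  # roughly a third larger than 3.10
```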


Activity

added the performance, interpreter-core, stdlib, and 3.12 labels on Nov 17, 2022
self-assigned this on Nov 17, 2022
added a commit that references this issue on Dec 22, 2022
3c033a2
added a commit that references this issue on Dec 22, 2022
09edde9
added a commit that references this issue on Dec 28, 2022

stonebig commented on Apr 25, 2023


As a remark, packing a directory of distributions with the same ~600 packages on Windows:

  • with .7z compression, the Python 3.11 one is 5% bigger than the 3.10 one (754 MB vs 713 MB)
  • with no compression, it's 9% bigger (3.9 GB vs 3.57 GB)
  • sympy-1.11.1 is the most inflated package, by about 90% (117 MB vs 61 MB), and only shrinks back to 101 MB with 3.12a7

hugovk commented on Mar 15, 2024

Member

Triage: can this issue be closed or is there more to do?
