Build backend tracking issue #8779

@konstin

uv should provide its own build backend. This issue tracks that work; it will become more granular as more things are implemented.

The uv build backend is currently in preview. It is used with uv init --lib --preview, and has a direct build fast path in uv build --preview.

Documentation:

TODO

Background reading

Including and excluding files in packages

To go from the source code to an installed (or published) package, we always start with a source tree (e.g. a directory in a repo, mostly the repo root), which is identified by containing a pyproject.toml with a build-system and a project section. To install the project, we always have to go through a wheel. Getting to the wheel is the task of the build backend; installing the wheel is a different, well-defined part of uv. From the source tree, we can either build a source distribution and from that a wheel, or build a wheel directly. This means that the source distribution must contain all files needed for the wheel.

The source dist has an indirection where there’s a <name>-<version> directory at the root and everything is below it, but that’s an implementation detail. For our purposes, source dists and wheels have a root directory we can add to.
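The indirection can be hidden behind a small helper. The following is a minimal sketch, not uv's implementation: `build_sdist` and `list_sdist` are hypothetical names, and the archive is built in memory with Python's standard `tarfile` module so that callers only deal with paths relative to the source dist root.

```python
import io
import tarfile


def build_sdist(name: str, version: str, files: dict[str, bytes]) -> bytes:
    """Pack `files` (root-relative path -> content) into an in-memory .tar.gz.

    Every entry is placed below `<name>-<version>/`, so callers never have to
    think about the top-level directory themselves.
    """
    prefix = f"{name}-{version}"
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for relative_path, content in sorted(files.items()):
            info = tarfile.TarInfo(f"{prefix}/{relative_path}")
            info.size = len(content)
            archive.addfile(info, io.BytesIO(content))
    return buffer.getvalue()


def list_sdist(data: bytes) -> list[str]:
    """Return the member names of an in-memory .tar.gz archive."""
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as archive:
        return archive.getnames()
```

Calling `list_sdist(build_sdist("foo", "0.1.0", {"pyproject.toml": b""}))` lists a single member below `foo-0.1.0/`.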

A source distribution usually contains a subset of the source tree in its root. It excludes generated and cache directories (.venv, .pytest_cache, etc.) and development files (tests, test data, CI, etc.). It includes the main Python module, certain metadata files (pyproject.toml, readme and its images, licenses) and crucial data files and blobs (sample dataframes in pandas, manylinux JSON in maturin, lists of known endpoints, db schemas, headers for C projects, launcher scripts, etc.) that may either live next to the source code or in one of the dedicated data dirs below.

PEP 639 defines license file globs such as project.license-files = ["third-party/LICEN[CS]E*", "AUTHORS*"], which we must support as given. These files have to be copied to the root for the source dist, and to <name>-<version>.dist-info/licenses in the wheel. In the source dist, we have to include the readme if linked from project.readme; in the wheel it becomes part of METADATA.
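As a rough illustration of the license handling, the sketch below expands the globs with `pathlib.Path.glob` and maps each match to its location in the wheel, keeping the path relative to the source tree. The function names and the `foo` 0.1.0 project are made up for the example, and `Path.glob` is only an approximation of the glob syntax PEP 639 specifies.

```python
import tempfile
from pathlib import Path


def license_destinations(
    source_tree: Path, patterns: list[str], name: str, version: str
) -> dict[str, str]:
    """Map each matched license file (source-tree relative) to its wheel path.

    In the source dist the file keeps its relative path below the root.
    """
    destinations = {}
    for pattern in patterns:
        for path in sorted(source_tree.glob(pattern)):
            if path.is_file():
                relative = path.relative_to(source_tree).as_posix()
                destinations[relative] = (
                    f"{name}-{version}.dist-info/licenses/{relative}"
                )
    return destinations


def demo_license_layout() -> dict[str, str]:
    """Build a throwaway source tree and resolve the globs from the text."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "third-party").mkdir()
        for relative in ("third-party/LICENSE-vendored.txt", "AUTHORS.md", "README.md"):
            (root / relative).write_text("placeholder")
        return license_destinations(
            root, ["third-party/LICEN[CS]E*", "AUTHORS*"], "foo", "0.1.0"
        )
```

In the demo tree, README.md is not matched by either glob, so it does not end up in the licenses directory.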

Our main module usually exists at src/<name>, or alternatively at <name>. For the src/<name> layout, it needs to move to <name> in the wheel (recursive directory copy). This directory may contain Python source files, files used by the source files (say some JSON with endpoints or a db schema SQL file) and files that we should skip, such as .pyc and __pycache__.
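The move from src/<name> to <name> and the skipping of bytecode can be sketched as a pure path mapping. This is an illustration under assumptions: `wheel_path_for_module_file` and its `module_root` parameter are hypothetical names, and the skip rules cover only the two cases named above.

```python
from pathlib import PurePosixPath
from typing import Optional


def wheel_path_for_module_file(
    source_path: str, module_root: str = "src"
) -> Optional[str]:
    """Return the path inside the wheel for a file of the main module.

    `source_path` is relative to the source tree root. `module_root` is the
    directory containing `<name>`: "src" for the src layout, "" for the flat
    layout. Returns None for files that should be skipped.
    """
    path = PurePosixPath(source_path)
    # Skip bytecode caches, wherever they are nested.
    if "__pycache__" in path.parts or path.suffix == ".pyc":
        return None
    if module_root:
        # Raises ValueError if the file is not below the module root.
        path = path.relative_to(module_root)
    return path.as_posix()
```

Data files next to the sources, such as a schema.sql, take the same route as the Python files.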

Wheels (but not source dists) allow data directories at <name>-<version>.data/<type>, with five predefined types. We have to let the user define which directories and files to include there, and then also copy those to the source dist.
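A sketch of that mapping, assuming the five keys named in the wheel specification (purelib, platlib, headers, scripts, data). The function name and the idea of passing one source directory per type are assumptions for illustration, not a proposed configuration format.

```python
from pathlib import PurePosixPath

# Keys below `<name>-<version>.data/`, as listed in the wheel specification.
DATA_TYPES = ("purelib", "platlib", "headers", "scripts", "data")


def wheel_data_path(
    name: str, version: str, data_type: str, source_dir: str, source_file: str
) -> str:
    """Map a file below a user-chosen `source_dir` to its wheel data path."""
    if data_type not in DATA_TYPES:
        raise ValueError(f"unknown data type: {data_type!r}")
    # Raises ValueError if `source_file` is not below `source_dir`.
    relative = PurePosixPath(source_file).relative_to(source_dir)
    return f"{name}-{version}.data/{data_type}/{relative.as_posix()}"
```

For example, a header at include/foo/foo.h with `source_dir="include"` lands at foo-0.1.0.data/headers/foo/foo.h, while the source dist keeps it at its original path.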

Native modules (.so/.pyd) are a special case, if we want to support them. They may exist in the source tree for development (esp. editables), but must not be copied to the source dist; instead, they must be generated from the source dist and added to the wheel.

We may want to allow the user to include different files in source tree → source dist than in source tree (repo or unpacked from a source dist) → wheel, especially when a build or code generation step becomes involved.

At the top level, the wheel must contain only a single module (we don’t support wheels with more than one top-level module), so there are no custom include patterns for wheels: the wheel contains dist-info (including license files), data files and the root module directory (potentially with exclusions).

Even with all this, the majority of projects will want three features: a readme (potentially with a transform for PyPI), license file(s) and a src/<name> directory. These can be covered by the right defaults, so most users shouldn’t need to change the default includes/excludes.

Tool Review

poetry

By default, uses gitignore. Using include makes it ignore gitignore.

[tool.poetry]
include = [
    { path = "tests", format = "sdist" },
    { path = "for_wheel.txt", format = ["sdist", "wheel"] }
]

If no format is specified, include defaults to only sdist.

In contrast, exclude defaults to both sdist and wheel.
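The asymmetry between the two defaults is easy to miss. A minimal sketch of how such entries could be normalized follows; `normalize_entry` is a hypothetical helper and not poetry's code.

```python
from typing import Union

INCLUDE_DEFAULT = frozenset({"sdist"})
EXCLUDE_DEFAULT = frozenset({"sdist", "wheel"})


def normalize_entry(
    entry: Union[str, dict], default_formats: frozenset
) -> tuple[str, frozenset]:
    """Return (path, formats) for a poetry-style include/exclude entry.

    A plain string or a table without `format` gets `default_formats`;
    `format` may be a single string or a list of strings.
    """
    if isinstance(entry, str):
        return entry, default_formats
    formats = entry.get("format")
    if formats is None:
        return entry["path"], default_formats
    if isinstance(formats, str):
        formats = [formats]
    return entry["path"], frozenset(formats)
```

With these defaults, a bare `"tests"` include only reaches the sdist, while a bare `"*.log"` exclude applies to both formats.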

pdm

includes (wheel), source-includes and package-dir.

If a file is covered by both includes and excludes, the one with the more path parts and less wildcards in the pattern wins, otherwise excludes takes precedence if the length is the same.

For example, given the following configuration:

includes = ["src"]
excludes = ["**/*.json"]

src/foo/data.json will be excluded since the pattern in excludes has more path parts, however, if we change the configuration to:

includes = ["src", "src/foo/data.json"]
excludes = ["**/*.json"]

the same file will be included since it is covered by includes with a more specific path.
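The precedence rule can be approximated in a few lines. This sketch is not pdm's implementation: it assumes path parts are compared first and wildcard count second, and it uses `fnmatch` for globbing, where `*` also matches `/`. It reproduces the two configurations above.

```python
import fnmatch


def covers(pattern: str, path: str) -> bool:
    """True if `pattern` names the file, a parent directory, or glob-matches it."""
    pattern = pattern.rstrip("/")
    if path == pattern or path.startswith(pattern + "/"):
        return True
    return fnmatch.fnmatchcase(path, pattern)


def specificity(pattern: str) -> tuple[int, int]:
    """More path parts rank higher; among equals, fewer wildcards rank higher."""
    parts = pattern.strip("/").split("/")
    wildcards = sum(pattern.count(char) for char in "*?[")
    return (len(parts), -wildcards)


def is_included(path: str, includes: list[str], excludes: list[str]) -> bool:
    """Decide whether `path` ends up in the package."""
    include_scores = [specificity(p) for p in includes if covers(p, path)]
    if not include_scores:
        return False
    exclude_scores = [specificity(p) for p in excludes if covers(p, path)]
    if not exclude_scores:
        return True
    # On a tie, excludes take precedence, hence the strict comparison.
    return max(include_scores) > max(exclude_scores)
```

With `includes = ["src"]` the JSON file loses to `**/*.json` (one path part against two); adding `"src/foo/data.json"` to includes gives three path parts and no wildcards, so the include wins.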

Test files under tests, if found, are included by sdist and excluded by other formats.

*.pyc, __pycache__/ and build/ are always excluded.

hatch(ling)

Respects gitignore and hgignore by default; set ignore-vcs to disable this.

Includes are applied first, then excludes. Every entry is a [Git-style glob pattern](https://git-scm.com/docs/gitignore#_pattern_format); hatchling uses pathspec.GitIgnoreSpec.from_lines internally.

[tool.hatch.build.targets.sdist]
include = [
  "pkg/*.py",
  "/tests",
]
exclude = [
  "*.json",
  "pkg/_compat.py",
]

You can use the only-include option to prevent directory traversal starting at the project root and only select specific relative paths to directories or files. Using this option ignores any defined include patterns.

There is an artifacts option to include gitignored files.

There is a hardcoded set of excluded directories (.git, __pycache__, etc.) and files (.DS_Store).

There is a skip-excluded-dirs option for performance, and only-include for traversing only certain directories.

maturin

Include and exclude are inspired by poetry.

include = [
  { path = "path/**/*", format = "sdist" },
  { path = "all", format = ["sdist", "wheel"] },
  { path = "for/wheel/**/*", format = "wheel" }
]

scikit-build-core

Uses gitignore by default. You can specify sdist includes and excludes, and wheel excludes.

For packages, it supports renames (last component must match):

[tool.scikit-build.wheel.packages]
"mypackage/subpackage" = "python/src/subpackage"

cargo

Include and exclude use gitignore syntax. By default, all files are included, not just src; when include is specified manually, src is no longer included unless it is listed.

If include is not specified, then the following files will be excluded:

  • If the package is not in a git repository, all “hidden” files starting with a dot will be skipped.
  • If the package is in a git repository, any files that are ignored by the [gitignore](https://git-scm.com/docs/gitignore) rules of the repository and global git configuration will be skipped.

Regardless of whether exclude or include is specified, the following files are always excluded:

  • Any sub-packages will be skipped (any subdirectory that contains a Cargo.toml file).
  • A directory named target in the root of the package will be skipped.
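The exclusion rules quoted above can be condensed into one predicate. The sketch below is an illustration in Python, not cargo's logic: the function and parameter names are hypothetical, the gitignore check is delegated to a caller-supplied callback, and "hidden" is assumed to mean any path component starting with a dot.

```python
from pathlib import PurePosixPath
from typing import Callable, Collection


def is_excluded(
    path: str,
    *,
    in_git_repo: bool,
    include_specified: bool = False,
    is_git_ignored: Callable[[str], bool] = lambda _path: False,
    subpackage_dirs: Collection[str] = (),
) -> bool:
    """Apply the always-on and default exclusion rules to a package-relative path."""
    parts = PurePosixPath(path).parts
    # Always excluded: a directory named `target` in the package root.
    if len(parts) > 1 and parts[0] == "target":
        return True
    # Always excluded: anything below a directory that has its own Cargo.toml.
    for depth in range(1, len(parts)):
        if "/".join(parts[:depth]) in subpackage_dirs:
            return True
    # The remaining rules only apply when `include` is not specified.
    if include_specified:
        return False
    if in_git_repo:
        return is_git_ignored(path)
    return any(part.startswith(".") for part in parts)
```

Here `subpackage_dirs` would be the set of subdirectories containing a Cargo.toml, collected beforehand by the caller.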

The following files are always included:

npm

There is a files list for includes with gitignore syntax. By default .gitignore is used, but .npmignore takes precedence. There are mandatory includes, default excludes and mandatory excludes.

Ecosystem Review

A random assortment of projects and syntaxes as data points.

boto3

include CONTRIBUTING.rst
include README.rst
include LICENSE
include requirements.txt
recursive-include boto3/data *.json

httpx

[tool.hatch.build.targets.sdist]
include = [
    "/httpx",
    "/CHANGELOG.md",
    "/README.md",
    "/tests",
]

charset-normalizer

include LICENSE README.md CHANGELOG.md charset_normalizer/py.typed dev-requirements.txt
recursive-include data *.md
recursive-include data *.txt
recursive-include docs *
recursive-include tests *

idna

[tool.flit.sdist]
exclude = [".gitignore", ".github/"]
include = ["tests", "tools", "HISTORY.rst"]

typing-extensions

[tool.flit.sdist]
include = ["CHANGELOG.md", "README.md", "tox.ini", "*/*test*.py"]
exclude = []

django

include AUTHORS
include Gruntfile.js
include INSTALL
include LICENSE
include LICENSE.python
include MANIFEST.in
include package.json
include tox.ini
include *.rst
graft django
graft docs
graft extras
graft js_tests
graft scripts
graft tests
global-exclude *.py[co]

pydantic

[tool.hatch.build.targets.sdist]
# limit which files are included in the sdist (.tar.gz) asset,
# see https://github.com/pydantic/pydantic/pull/4542
include = [
    '/README.md',
    '/HISTORY.md',
    '/Makefile',
    '/pydantic',
    '/tests',
]

scikit-learn

Programmatically with meson, I think.

spacy

recursive-include spacy *.pyi *.pyx *.pxd *.txt *.cfg *.jinja *.toml *.hh
include LICENSE
include README.md
include pyproject.toml
include spacy/py.typed
recursive-include spacy/cli *.yml
recursive-include licenses *
recursive-exclude spacy *.cpp

auditwheel

include README.rst
include LICENSE
include CHANGELOG.md
include src/auditwheel/policy/*.json
include src/auditwheel/_vendor/wheel/LICENSE.txt

graft tests

exclude .coveragerc
exclude .gitignore
exclude .git-blame-ignore-revs
exclude .pre-commit-config.yaml
exclude .travis.yml
exclude noxfile.py

prune .github
prune scripts
prune tests/**/__pycache__
prune tests/**/*.egg-info
prune tests/**/build

global-exclude *.so .DS_Store

ripgrep

exclude = [
  "HomebrewFormula",
  "/.github/",
  "/ci/",
  "/pkg/brew",
  "/benchsuite/",
  "/scripts/",
]

alphafold3

[tool.scikit-build]
wheel.exclude = [
    "**.pyx",
    "**/CMakeLists.txt",
    "**.cc",
    "**.h"
]
sdist.include = [
    "LICENSE",
    "OUTPUT_TERMS_OF_USE.md",
    "WEIGHTS_PROHIBITED_USE_POLICY.md",
    "WEIGHTS_TERMS_OF_USE.md",
]

watchfiles
