Explicitly set EntryType for file entries in tar #17043
Out of curiosity, did you file a bug with CPython for this? I suspect it's the kind of thing they'd be interested in fixing in a patch release for each non-EOL version. Edit: my bad, I see you filed it as python/cpython#141707! Linking here so these get cross-referenced 🙂
There's also a related issue in
Do you have a reproducible example (ideally a standalone Rust script) and a Python script that both we and CPython can use to check the problem and its fix?
Here's a Rust script that will create two tar files in the local directory:

#!/usr/bin/env -S cargo +nightly -Zscript
---cargo
[dependencies]
tar = "0.4.44"
---
use std::fs::File;
use std::io::Error as IoError;
use std::path::Path;

use tar::Builder;
use tar::EntryType;
use tar::Header;

fn create_tar(filename: impl AsRef<Path>, entry_type: Option<EntryType>) -> Result<(), IoError> {
    let file = File::options().write(true).create(true).truncate(true).open(filename)?;
    let mut tarfile = Builder::new(file);
    // The second entry's path is longer than 100 bytes and has a `/` at byte 100.
    let entries = [
        "file1".into(),
        format!("{}/bbb", "a".repeat(99)),
        "file2".into(),
    ];
    for entry in entries {
        let mut header = Header::new_gnu();
        header.set_mode(0o644);
        header.set_size(entry.len() as u64);
        // `None` leaves the header's default entry type untouched.
        if let Some(entry_type) = entry_type {
            header.set_entry_type(entry_type);
        }
        // Each file's contents are simply its own path.
        tarfile.append_data(&mut header, &entry, entry.as_bytes())?;
    }
    tarfile.finish()?;
    Ok(())
}

fn main() -> Result<(), IoError> {
    create_tar("good.tar", Some(EntryType::Regular))?;
    create_tar("bad.tar", None)?;
    Ok(())
}

And here's a Python script to test the two tar files:

#!/usr/bin/env -S uv run --script
# /// script
# requires-python = ">=3.9"
# dependencies = [
#     "pytest",
# ]
# ///
import tarfile
from io import BytesIO

import pytest


def get_members(filename):
    with tarfile.open(filename, 'r') as tar:
        return tar.getmembers()


def test_good_tar():
    assert len(get_members("good.tar")) == 3


@pytest.mark.xfail(strict=True)
def test_bad_tar():
    assert len(get_members("bad.tar")) == 3


@pytest.mark.parametrize('format', (
    pytest.param(tarfile.GNU_FORMAT, id="gnu"),
    pytest.param(tarfile.PAX_FORMAT, id="pax"),
))
@pytest.mark.xfail(strict=True)
def test_python_only(format):
    fp = BytesIO()
    with tarfile.open(mode='w', fileobj=fp, format=format) as tar:
        info = tarfile.TarInfo()
        info.type = tarfile.AREGTYPE
        info.name = ("a" * 99) + "/bbb"
        tar.addfile(info)
        expected = {t.name: t.type for t in tar.getmembers()}
    fp.seek(0)
    with tarfile.open(mode='r', fileobj=fp) as tar:
        actual = {t.name: t.type for t in tar.getmembers()}
    assert expected == actual


if __name__ == "__main__":
    pytest.main([__file__])
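
To look at what is actually stored in `good.tar` and `bad.tar` without going through `tarfile`'s own parser, a small header walker can help. The following is only a sketch: `walk_headers` and `sample_archive` are names made up for this example, it assumes an uncompressed archive with plain octal size fields, and it builds its sample input with Python's `tarfile` rather than the Rust `tar` crate, so the bytes may differ from what the Rust script writes.

```python
import io
import tarfile

BLOCK = 512  # tar archives are a sequence of 512-byte blocks


def walk_headers(data: bytes):
    """Yield (name field, type flag, size) for every header block in a tar."""
    offset = 0
    while offset + BLOCK <= len(data):
        header = data[offset:offset + BLOCK]
        if header == bytes(BLOCK):  # an all-zero block marks the end of the archive
            break
        name = header[0:100].split(b"\0", 1)[0].decode("utf-8", "replace")
        size_field = header[124:136].strip(b"\0 ")
        size = int(size_field, 8) if size_field else 0  # assumes octal encoding
        typeflag = header[156:157]
        yield name, typeflag, size
        # Skip the header plus the entry's data, rounded up to whole blocks.
        offset += BLOCK + -(-size // BLOCK) * BLOCK


def sample_archive() -> bytes:
    """Build an in-memory archive with the same three paths as the Rust script."""
    buf = io.BytesIO()
    with tarfile.open(mode="w", fileobj=buf, format=tarfile.GNU_FORMAT) as tar:
        for name in ("file1", "a" * 99 + "/bbb", "file2"):
            payload = name.encode()
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            info.type = tarfile.AREGTYPE  # the 'legacy' regular file type
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


headers = list(walk_headers(sample_archive()))
for name, typeflag, size in headers:
    print(f"{typeflag!r:8} {size:6} {name[:40]}")
```

To inspect the files from the Rust script instead, pass their contents to the same function, for example `walk_headers(open("bad.tar", "rb").read())`.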
konstin
left a comment
Thanks!
I've added a comment explaining why we need to set that type.
This MR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [astral-sh/uv](https://github.com/astral-sh/uv) | patch | `0.9.17` -> `0.9.18` |

MR created with the help of [el-capitano/tools/renovate-bot](https://gitlab.com/el-capitano/tools/renovate-bot).

**Proposed changes to behavior should be submitted there as MRs.**

---

### Release Notes

<details>
<summary>astral-sh/uv (astral-sh/uv)</summary>

### [`v0.9.18`](https://github.com/astral-sh/uv/blob/HEAD/CHANGELOG.md#0918)

[Compare Source](astral-sh/uv@0.9.17...0.9.18)

Released on 2025-12-16.

##### Enhancements

- Add value hints to command line arguments to improve shell completion accuracy ([#17080](astral-sh/uv#17080))
- Improve error handling in `uv publish` ([#17096](astral-sh/uv#17096))
- Improve rendering of multiline error messages ([#17132](astral-sh/uv#17132))
- Support redirects in `uv publish` ([#17130](astral-sh/uv#17130))
- Include Docker images with the alpine version, e.g., `python3.x-alpine3.23` ([#17100](astral-sh/uv#17100))

##### Configuration

- Accept `--torch-backend` in `[tool.uv]` ([#17116](astral-sh/uv#17116))

##### Performance

- Speed up `uv cache size` ([#17015](astral-sh/uv#17015))
- Initialize S3 signer once ([#17092](astral-sh/uv#17092))

##### Bug fixes

- Avoid panics due to reads on failed requests ([#17098](astral-sh/uv#17098))
- Enforce latest-version in `@latest` requests ([#17114](astral-sh/uv#17114))
- Explicitly set `EntryType` for file entries in tar ([#17043](astral-sh/uv#17043))
- Ignore `pyproject.toml` index username in lockfile comparison ([#16995](astral-sh/uv#16995))
- Relax error when using `uv add` with `UV_GIT_LFS` set ([#17127](astral-sh/uv#17127))
- Support file locks on ExFAT on macOS ([#17115](astral-sh/uv#17115))
- Change schema for `exclude-newer` into optional string ([#17121](astral-sh/uv#17121))

##### Documentation

- Drop arm musl caveat from Docker documentation ([#17111](astral-sh/uv#17111))
- Fix version reference in resolver example ([#17085](astral-sh/uv#17085))
- Better documentation for `exclude-newer*` ([#17079](astral-sh/uv#17079))

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever MR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this MR and you won't be reminded about this update again.

---

This MR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).
Summary
This PR explicitly sets the entry type for files in an sdist. This changes the entry type from `AREGTYPE` (the 'legacy' regular file type) to `REGTYPE` (the 'normal' regular file type) in the generated tar.

This change works around a bug in the Python `tarfile` module that causes all entries after a certain point in the tar to be silently ignored if any entry matches some very specific conditions. In `maturin` this was very visible since the `PKG-INFO` was written at the very end, so `twine check` would loudly complain that the `PKG-INFO` was missing and that the sdist was invalid. In `uv` the `PKG-INFO` is written at the beginning, so this issue is unlikely to be caught.

Note that this change does mean that sdists created with newer versions of the uv build backend will not be byte-for-byte identical with sdists from an older version.
See PyO3/maturin#2855 (comment)
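
The two entry types differ in a single byte of each file's tar header. As a rough illustration, here is a sketch that uses Python's standard `tarfile` module rather than the Rust `tar` crate that uv uses; the helper name `first_typeflag` is invented for this example. It writes a one-entry archive in memory and returns the raw type flag byte, which sits at offset 156 of the 512-byte header.

```python
import io
import tarfile

TYPEFLAG_OFFSET = 156  # position of the type flag within a tar header block


def first_typeflag(entry_type: bytes) -> bytes:
    """Write a one-entry archive in memory and return that entry's raw type flag."""
    payload = b"hello"
    buf = io.BytesIO()
    with tarfile.open(mode="w", fileobj=buf, format=tarfile.GNU_FORMAT) as tar:
        info = tarfile.TarInfo(name="file1")  # short name, so the file header comes first
        info.size = len(payload)
        info.type = entry_type
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()[TYPEFLAG_OFFSET:TYPEFLAG_OFFSET + 1]


legacy = first_typeflag(tarfile.AREGTYPE)  # the 'legacy' regular file type
normal = first_typeflag(tarfile.REGTYPE)   # the 'normal' regular file type
print("AREGTYPE byte:", legacy, "REGTYPE byte:", normal)
```

Because that byte changes for every file entry, archives written before and after the change cannot be byte-for-byte identical, which matches the note above.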
Test Plan
This is the same change that was made in maturin to work around the same issue.
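
A round-trip check along the following lines could complement that. It is a minimal sketch, assuming Python's `tarfile` as both writer and reader, so it approximates rather than reproduces the archives that uv emits; `write_then_read` is a hypothetical helper, and the entry names are the ones from the reproduction script in the conversation.

```python
import io
import tarfile


def write_then_read(names):
    """Write entries with an explicit REGTYPE flag, then list what tarfile reads back."""
    buf = io.BytesIO()
    with tarfile.open(mode="w", fileobj=buf, format=tarfile.GNU_FORMAT) as tar:
        for name in names:
            payload = name.encode()
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            info.mode = 0o644
            info.type = tarfile.REGTYPE  # set explicitly, mirroring the fix in this PR
            tar.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    with tarfile.open(mode="r", fileobj=buf) as tar:
        return [(member.name, member.type) for member in tar.getmembers()]


names = ["file1", "a" * 99 + "/bbb", "file2"]
members = write_then_read(names)
print(len(members), "members read back")
```

With the explicit type, every entry written should be visible to the reader, including the ones that follow the long path.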