Description
Bug description
Summary
The Test PyTorch / pl-cpu workflow fails consistently on Linux + Python 3.12 for configurations pinned to PyTorch 2.4.1 and 2.5.1, due to an unsatisfiable dependency resolution involving torch-tensorrt.
This does not reproduce on macOS/Windows because the torch-tensorrt requirement is currently Linux-only.
Affected CI job(s)
- Test PyTorch / pl-cpu (ubuntu-22.04, lightning, 3.12.7, 2.4.1)
- Test PyTorch / pl-cpu (ubuntu-22.04, lightning, 3.12.7, 2.5.1)
Current behavior / error
CI fails during dependency installation (using uv pip install) with “No solution found” because the available torch-tensorrt wheels from the configured sources require newer torch versions than the job’s pinned torch.
Typical failure excerpt pattern:
- Resolver only sees `torch-tensorrt==2.6.x+cu126`, `2.7.x+cu12x`, `2.8.x+cu12x`, `2.9.x+cu12x` …
- Those versions depend on `torch==2.6.0` or `torch>=2.7.0` (etc.)
- But CI pins `torch==2.4.1` / `torch==2.5.1`
What version are you seeing the problem on?
v2.5
Reproduced in studio
No response
How to reproduce the bug
## Repro / relevant configuration
The failing install happens in `.github/workflows/ci-tests-pytorch.yml`:
uv pip install ".[${EXTRA_PREFIX}extra,${EXTRA_PREFIX}test,${EXTRA_PREFIX}strategies]" \
--upgrade \
--find-links="${TORCH_URL}" \
--find-links="https://download.pytorch.org/whl/torch-tensorrt"
The requirement is currently in `requirements/pytorch/test.txt`:
torch-tensorrt; platform_system == "Linux" and python_version >= "3.12"
Error messages and logs
Run uv pip install ".[${EXTRA_PREFIX}extra,${EXTRA_PREFIX}test,${EXTRA_PREFIX}strategies]"
× No solution found when resolving dependencies:
╰─▶ Because only the following versions of torch-tensorrt{sys_platform ==
'linux'} are available:
torch-tensorrt{sys_platform == 'linux'}==2.6.0+cu126
torch-tensorrt{sys_platform == 'linux'}==2.6.1+cu126
torch-tensorrt{sys_platform == 'linux'}==2.7.0+cu126
torch-tensorrt{sys_platform == 'linux'}==2.7.0+cu128
torch-tensorrt{sys_platform == 'linux'}==2.8.0+cu126
torch-tensorrt{sys_platform == 'linux'}==2.8.0+cu128
torch-tensorrt{sys_platform == 'linux'}==2.8.0+cu129
torch-tensorrt{sys_platform == 'linux'}==2.9.0+cu126
torch-tensorrt{sys_platform == 'linux'}==2.9.0+cu128
torch-tensorrt{sys_platform == 'linux'}==2.9.0+cu130
and torch-tensorrt<=2.6.1+cu126 depends on torch==2.6.0, we can conclude
that torch-tensorrt{sys_platform == 'linux'}<2.6.1+cu126 depends on
torch==2.6.0.
And because torch-tensorrt==2.6.1+cu126 depends on torch==2.6.0, we
can conclude that torch-tensorrt{sys_platform == 'linux'}<2.7.0+cu126
depends on torch==2.6.0.
And because torch-tensorrt>=2.7.0+cu126,<=2.7.0+cu128 depends on
torch>=2.7.0,<2.8.0 and torch>=2.7.0,<2.8.0, we can conclude that
torch-tensorrt{sys_platform == 'linux'}<2.8.0+cu126 depends on one of:
torch==2.6.0
torch>=2.7.0,<2.8.0
Environment
Current environment
#- PyTorch Lightning Version (e.g., 2.5.0):
#- PyTorch Version (e.g., 2.5):
#- Python version (e.g., 3.12):
#- OS (e.g., Linux):
#- CUDA/cuDNN version:
#- GPU models and configuration:
#- How you installed Lightning(`conda`, `pip`, source):
More info
Root cause analysis (why the resolver fails)
- `torch-tensorrt` is pulled unconditionally for Linux + Py3.12 test installs, including CPU-only CI jobs.
- CI also points the resolver at the PyTorch torch-tensorrt wheel index (`https://download.pytorch.org/whl/torch-tensorrt`).
- That index currently appears to publish only CUDA-tagged `torch-tensorrt` wheels (e.g. `+cu126`, `+cu128`, …) that require `torch>=2.6` / `>=2.7+`.
- Therefore, for Linux + Py3.12 jobs pinned to `torch==2.4.1` or `torch==2.5.1`, resolution is mathematically unsatisfiable.
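
To make the conflict concrete, here is a minimal sketch (not part of the Lightning codebase) that checks a pinned torch version against the `torch-tensorrt` constraints quoted in the resolver log above. The table covers only the 2.6.x and 2.7.0 wheels, because the log excerpt does not show the constraints for 2.8.x and 2.9.x. The names `TRT_REQUIRES_TORCH`, `satisfies`, and `compatible_trt_wheels` are hypothetical, and the version parsing is deliberately simplistic (plain `X.Y.Z` strings only).

```python
import operator

# Comparison operators appearing in the constraints from the resolver log.
OPS = {"==": operator.eq, ">=": operator.ge, "<": operator.lt}

# torch-tensorrt wheel -> torch constraint, copied from the log excerpt.
# Constraints for 2.8.x / 2.9.x are not shown in the log, so they are omitted.
TRT_REQUIRES_TORCH = {
    "2.6.0+cu126": [("==", "2.6.0")],
    "2.6.1+cu126": [("==", "2.6.0")],
    "2.7.0+cu126": [(">=", "2.7.0"), ("<", "2.8.0")],
    "2.7.0+cu128": [(">=", "2.7.0"), ("<", "2.8.0")],
}


def parse_version(text):
    """Turn a plain 'X.Y.Z' string into a comparable tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def satisfies(torch_version, clauses):
    """Return True if torch_version meets every (operator, bound) clause."""
    installed = parse_version(torch_version)
    return all(OPS[op](installed, parse_version(bound)) for op, bound in clauses)


def compatible_trt_wheels(torch_version):
    """List the torch-tensorrt wheels whose torch constraint is satisfied."""
    return [
        wheel
        for wheel, clauses in TRT_REQUIRES_TORCH.items()
        if satisfies(torch_version, clauses)
    ]


if __name__ == "__main__":
    for pinned in ("2.4.1", "2.5.1", "2.6.0"):
        print(pinned, compatible_trt_wheels(pinned))
```

For the two failing pins (`2.4.1` and `2.5.1`) the list is empty, which is the same conclusion the resolver reaches.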
Suggestion (keep TRT coverage; avoid breaking CPU jobs)
Update `.github/workflows/ci-tests-pytorch.yml`:
- Compute an env var (e.g. `INSTALL_TORCH_TRT=1/0`) based on the matrix torch version (and OS).
- If disabled, remove the `torch-tensorrt` line from `requirements/pytorch/test.txt` before running `uv pip install ...`.
- Only add `--find-links=https://download.pytorch.org/whl/torch-tensorrt` when `INSTALL_TORCH_TRT=1`.
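
One possible shape for the gating logic, sketched as a small Python helper that a workflow step could call before `uv pip install`. This is only an illustration under stated assumptions: the `2.6.0` threshold is taken from the oldest wheel listed in the resolver log, the OS check assumes runner names such as `ubuntu-22.04`, and the function names `should_install_trt` and `strip_trt_requirement` are hypothetical.

```python
import re


def parse_version(text):
    """Extract the leading numeric 'X.Y[.Z]' part of a version as a tuple."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"unrecognised version: {text!r}")
    return tuple(int(part or 0) for part in match.groups())


def should_install_trt(torch_version, os_name, min_torch="2.6.0"):
    """Decide the INSTALL_TORCH_TRT flag for one matrix entry.

    Assumption: torch-tensorrt is only installable on Linux runners with
    torch >= min_torch (2.6.0 is the oldest wheel seen in the log).
    """
    name = os_name.lower()
    is_linux = name.startswith("ubuntu") or name.startswith("linux")
    return is_linux and parse_version(torch_version) >= parse_version(min_torch)


def strip_trt_requirement(requirements_text):
    """Drop any requirement line that starts with 'torch-tensorrt'."""
    kept = [
        line
        for line in requirements_text.splitlines(keepends=True)
        if not line.strip().startswith("torch-tensorrt")
    ]
    return "".join(kept)


if __name__ == "__main__":
    # Example: the failing matrix entry from this report.
    flag = should_install_trt("2.4.1", "ubuntu-22.04")
    print(f"INSTALL_TORCH_TRT={int(flag)}")
```

When the flag is `0`, the step would rewrite `requirements/pytorch/test.txt` with `strip_trt_requirement` and skip the extra `--find-links` argument. Otherwise the install command stays as it is today.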
I can work on this if the suggestion is OK.