Linux Py3.12 CPU CI fails to resolve torch-tensorrt when testing older pinned PyTorch (2.4.1 / 2.5.1) #21424

@littlebullGit

Bug description

Summary

The Test PyTorch / pl-cpu workflow fails consistently on Linux + Python 3.12 for configurations pinned to PyTorch 2.4.1 and 2.5.1, due to an unsatisfiable dependency resolution involving torch-tensorrt.

This does not reproduce on macOS/Windows because the torch-tensorrt requirement is currently Linux-only.

Affected CI job(s)

  • Test PyTorch / pl-cpu (ubuntu-22.04, lightning, 3.12.7, 2.4.1)
  • Test PyTorch / pl-cpu (ubuntu-22.04, lightning, 3.12.7, 2.5.1)

Current behavior / error

CI fails during dependency installation (using uv pip install) with “No solution found” because the available torch-tensorrt wheels from the configured sources require newer torch versions than the job’s pinned torch.

Typical failure pattern:

  • Resolver only sees torch-tensorrt==2.6.x+cu126, 2.7.x+cu12x, 2.8.x+cu12x, 2.9.x+cu12x
  • Those versions depend on torch==2.6.0 (torch-tensorrt 2.6.x) or torch>=2.7.0 (torch-tensorrt 2.7.x and later)
  • But CI pins torch==2.4.1 / torch==2.5.1

What version are you seeing the problem on?

v2.5

How to reproduce the bug

## Repro / relevant configuration
The failing install happens in `.github/workflows/ci-tests-pytorch.yml`:


uv pip install ".[${EXTRA_PREFIX}extra,${EXTRA_PREFIX}test,${EXTRA_PREFIX}strategies]" \
  --upgrade \
  --find-links="${TORCH_URL}" \
  --find-links="https://download.pytorch.org/whl/torch-tensorrt"


The requirement is currently in `requirements/pytorch/test.txt`:


torch-tensorrt; platform_system == "Linux" and python_version >= "3.12"

Error messages and logs

Run uv pip install ".[${EXTRA_PREFIX}extra,${EXTRA_PREFIX}test,${EXTRA_PREFIX}strategies]"
× No solution found when resolving dependencies:
╰─▶ Because only the following versions of torch-tensorrt{sys_platform ==
'linux'} are available:
torch-tensorrt{sys_platform == 'linux'}==2.6.0+cu126
torch-tensorrt{sys_platform == 'linux'}==2.6.1+cu126
torch-tensorrt{sys_platform == 'linux'}==2.7.0+cu126
torch-tensorrt{sys_platform == 'linux'}==2.7.0+cu128
torch-tensorrt{sys_platform == 'linux'}==2.8.0+cu126
torch-tensorrt{sys_platform == 'linux'}==2.8.0+cu128
torch-tensorrt{sys_platform == 'linux'}==2.8.0+cu129
torch-tensorrt{sys_platform == 'linux'}==2.9.0+cu126
torch-tensorrt{sys_platform == 'linux'}==2.9.0+cu128
torch-tensorrt{sys_platform == 'linux'}==2.9.0+cu130
and torch-tensorrt<=2.6.1+cu126 depends on torch==2.6.0, we can conclude
that torch-tensorrt{sys_platform == 'linux'}<2.6.1+cu126 depends on
torch==2.6.0.
And because torch-tensorrt==2.6.1+cu126 depends on torch==2.6.0, we
can conclude that torch-tensorrt{sys_platform == 'linux'}<2.7.0+cu126
depends on torch==2.6.0.
And because torch-tensorrt>=2.7.0+cu126,<=2.7.0+cu128 depends on
torch>=2.7.0,<2.8.0 and torch>=2.7.0,<2.8.0, we can conclude that
torch-tensorrt{sys_platform == 'linux'}<2.8.0+cu126 depends on one of:
torch==2.6.0
torch>=2.7.0,<2.8.0

More info

Root cause analysis (why the resolver fails)

  1. torch-tensorrt is pulled in for every Linux + Python 3.12 test install, including CPU-only CI jobs, regardless of the pinned torch version.
  2. CI also points the resolver at the PyTorch torch-tensorrt wheel index (https://download.pytorch.org/whl/torch-tensorrt).
  3. That index currently appears to publish only CUDA-tagged torch-tensorrt wheels (e.g. +cu126, +cu128, …). The oldest of them (2.6.x) already requires torch==2.6.0, and the newer ones require torch 2.7.0 or later.
  4. Therefore, for Linux + Py3.12 jobs pinned to torch==2.4.1 or torch==2.5.1, no available torch-tensorrt wheel is compatible with the pinned torch, and resolution is unsatisfiable.
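
To make the conflict concrete, here is a minimal sketch that checks a pinned torch version against the torch constraints quoted in the resolver log above. It covers only the 2.6.x and 2.7.0 wheels, because the log excerpt is cut off before the constraints of 2.8.x and 2.9.x. The helper names (`version_lt`, `torch_satisfies`, `compatible_wheels`) are hypothetical, and the version comparison assumes a `sort` that supports `-V` (GNU coreutils).

```shell
#!/usr/bin/env bash
# Sketch only: the constraints below are copied from the resolver log excerpt.

# version_lt A B: succeeds when version A sorts strictly before version B.
version_lt() {
  [ "$1" != "$2" ] && [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# torch_satisfies PINNED WHEEL: succeeds when PINNED meets WHEEL's torch constraint.
torch_satisfies() {
  local pinned="$1" wheel="$2"
  case "$wheel" in
    # torch-tensorrt 2.6.x depends on torch==2.6.0
    2.6.*) [ "$pinned" = "2.6.0" ] ;;
    # torch-tensorrt 2.7.0 depends on torch>=2.7.0,<2.8.0
    2.7.*) ! version_lt "$pinned" "2.7.0" && version_lt "$pinned" "2.8.0" ;;
    # constraint not visible in the log excerpt: treat as unknown
    *) return 1 ;;
  esac
}

# compatible_wheels PINNED: print each listed wheel that accepts PINNED, one per line.
compatible_wheels() {
  local wheel
  for wheel in 2.6.0+cu126 2.6.1+cu126 2.7.0+cu126 2.7.0+cu128; do
    if torch_satisfies "$1" "$wheel"; then echo "$wheel"; fi
  done
}

# Report the two pinned versions used by the failing CI jobs.
for pinned in 2.4.1 2.5.1; do
  echo "torch==${pinned}: compatible wheels: [$(compatible_wheels "$pinned" | tr '\n' ' ')]"
done
```

Neither pinned version satisfies any of the listed constraints, which is the situation uv reports as "No solution found".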

Suggestion (keep TRT coverage; avoid breaking CPU jobs)

Update .github/workflows/ci-tests-pytorch.yml:

  1. Compute an env var (e.g. INSTALL_TORCH_TRT=1/0) based on the matrix torch version (and OS).
  2. If disabled, remove the torch-tensorrt line from requirements/pytorch/test.txt before running uv pip install ....
  3. Only add --find-links=https://download.pytorch.org/whl/torch-tensorrt when INSTALL_TORCH_TRT=1.
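
The three steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not the final workflow change: the helper names, the `MIN_TRT_TORCH` floor of 2.6.0 (taken from the oldest wheel constraint in the log), and the `PYTORCH_VERSION` variable standing in for the matrix torch version are all hypothetical. `RUNNER_OS` is the variable GitHub Actions sets on its runners, and `sort -V` is assumed to be available.

```shell
#!/usr/bin/env bash
# Sketch of the proposed gating; names and the 2.6.0 floor are assumptions.

MIN_TRT_TORCH="2.6.0"   # assumed: oldest torch accepted by the listed wheels

# should_install_trt TORCH_VERSION OS: print 1 if torch-tensorrt should be installed, else 0.
should_install_trt() {
  local torch_version="$1" os="$2" oldest
  oldest="$(printf '%s\n%s\n' "$MIN_TRT_TORCH" "$torch_version" | sort -V | head -n 1)"
  # Enabled only on Linux and only when the pinned torch is at or above the floor.
  if [ "$os" = "Linux" ] && [ "$oldest" = "$MIN_TRT_TORCH" ]; then echo 1; else echo 0; fi
}

# strip_trt_requirement FILE: drop the torch-tensorrt line from FILE in place.
strip_trt_requirement() {
  local tmp
  tmp="$(mktemp)"
  grep -v '^torch-tensorrt' "$1" > "$tmp" || true   # grep exits 1 if no lines remain
  mv "$tmp" "$1"
}

# build_find_links TORCH_URL FLAG: print one --find-links argument per line.
build_find_links() {
  echo "--find-links=$1"
  if [ "$2" = "1" ]; then
    echo "--find-links=https://download.pytorch.org/whl/torch-tensorrt"
  fi
}

# Possible use inside the workflow step (PYTORCH_VERSION is a hypothetical name):
#   INSTALL_TORCH_TRT="$(should_install_trt "$PYTORCH_VERSION" "$RUNNER_OS")"
#   [ "$INSTALL_TORCH_TRT" = "1" ] || strip_trt_requirement requirements/pytorch/test.txt
#   mapfile -t LINKS < <(build_find_links "$TORCH_URL" "$INSTALL_TORCH_TRT")
#   uv pip install ".[${EXTRA_PREFIX}extra,${EXTRA_PREFIX}test,${EXTRA_PREFIX}strategies]" \
#     --upgrade "${LINKS[@]}"
```

Editing the requirements file only inside the CI checkout keeps `requirements/pytorch/test.txt` unchanged in the repository, so jobs with torch 2.6.0 or newer would keep their TensorRT coverage.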

I can work on this if the suggestion is acceptable.

cc @ethanwharris
