Description
Checklist
- 1. I have searched related issues but cannot get the expected help.
- 2. The bug has not been fixed in the latest version.
- 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.
- 4. If the issue you raised is not a bug but a question, please raise a discussion at https://github.com/sgl-project/sglang/discussions/new/choose Otherwise, it will be closed.
- 5. Please use English, otherwise it will be closed.
Describe the bug
LoRA backend import error on ROCm: ModuleNotFoundError: No module named 'flashinfer'.
Log output:
INFO 02-05 07:04:11 __init__.py:179] Automatically detected platform rocm.
WARNING 02-05 07:04:11 rocm.py:34] `fork` method is not supported by ROCm. VLLM_WORKER_MULTIPROC_METHOD is overridden to `spawn` instead.
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "/sgl-workspace/sglang_public/python/sglang/bench_one_batch.py", line 60, in <module>
from sglang.srt.entrypoints.engine import _set_envs_and_config
File "/sgl-workspace/sglang_public/python/sglang/srt/entrypoints/engine.py", line 36, in <module>
from sglang.srt.managers.data_parallel_controller import (
File "/sgl-workspace/sglang_public/python/sglang/srt/managers/data_parallel_controller.py", line 31, in <module>
from sglang.srt.managers.scheduler import run_scheduler_process
File "/sgl-workspace/sglang_public/python/sglang/srt/managers/scheduler.py", line 81, in <module>
from sglang.srt.managers.tp_worker import TpModelWorker
File "/sgl-workspace/sglang_public/python/sglang/srt/managers/tp_worker.py", line 31, in <module>
from sglang.srt.model_executor.model_runner import ModelRunner
File "/sgl-workspace/sglang_public/python/sglang/srt/model_executor/model_runner.py", line 47, in <module>
from sglang.srt.lora.lora_manager import LoRAManager
File "/sgl-workspace/sglang_public/python/sglang/srt/lora/lora_manager.py", line 23, in <module>
from sglang.srt.lora.backend import TritonLoraBackend
File "/sgl-workspace/sglang_public/python/sglang/srt/lora/backend/__init__.py", line 2, in <module>
from .flashinfer_backend import FlashInferLoraBackend
File "/sgl-workspace/sglang_public/python/sglang/srt/lora/backend/flashinfer_backend.py", line 4, in <module>
from flashinfer import SegmentGEMMWrapper
ModuleNotFoundError: No module named 'flashinfer'
Related PR: [Feature] Define backends and add Triton backend for Lora (#3161)
Issue Details:
https://github.com/sgl-project/sglang/blob/main/python/sglang/srt/lora/backend/__init__.py
https://github.com/sgl-project/sglang/blob/main/python/sglang/srt/lora/lora_manager.py#L23
This is not a matter of backend selection: __init__.py imports the flashinfer backend unconditionally, so the failure occurs no matter which LoRA backend is chosen.
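For reference, the two imports that fail, as shown in the traceback above. Importing anything from sglang.srt.lora.backend executes the package __init__.py, which eagerly pulls in the flashinfer backend:

    # python/sglang/srt/lora/backend/__init__.py, line 2 (runs on any import from the backend package)
    from .flashinfer_backend import FlashInferLoraBackend

    # python/sglang/srt/lora/backend/flashinfer_backend.py, line 4
    from flashinfer import SegmentGEMMWrapper  # raises ModuleNotFoundError when flashinfer is absent (ROCm)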
The following local edits work around it:

python/sglang/srt/lora/backend/__init__.py:
    # from .flashinfer_backend import FlashInferLoraBackend
    __all__ = [
        # "FlashInferLoraBackend",
        "TritonLoraBackend",
    ]

python/sglang/srt/lora/lora_manager.py#L23:
    from sglang.srt.lora.backend import FlashInferLoraBackend, TritonLoraBackend
    ->
    from sglang.srt.lora.backend import TritonLoraBackend
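A possibly cleaner variant of the same workaround is sketched below: guard the flashinfer import so the Triton backend keeps working when flashinfer is unavailable. This is only a suggestion, not the current upstream code, and the triton_backend module name is assumed from the TritonLoraBackend class name.

    # Sketch for python/sglang/srt/lora/backend/__init__.py
    from .triton_backend import TritonLoraBackend  # module name assumed

    try:
        from .flashinfer_backend import FlashInferLoraBackend
    except ImportError:
        # flashinfer is not installed (e.g. ROCm builds); keep the name importable so
        # existing "from sglang.srt.lora.backend import FlashInferLoraBackend" lines still work.
        FlashInferLoraBackend = None

    __all__ = [
        "FlashInferLoraBackend",
        "TritonLoraBackend",
    ]

With a guard like this, lora_manager.py would not need to be patched, as long as the flashinfer backend is only instantiated when it is actually selected.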
Reproduction
python -m sglang.bench_one_batch --batch-size 32 --input 128 --output 32 --model deepseek-ai/DeepSeek-V3 --tp 8 --trust-remote-code
Environment
ROCm, with the latest code:
git clone https://github.com/sgl-project/sglang.git
cd sglang
pip install --upgrade pip
cd sgl-kernel
python setup_rocm.py install
cd ..
pip install -e "python[all_hip]"