Description
Checklist
- 1. If the issue you raised is not a feature but a question, please raise a discussion at https://github.com/sgl-project/sglang/discussions/new/choose instead. Otherwise, it will be closed.
- 2. Please use English; otherwise, the issue will be closed.
Motivation
[2025-04-15 06:23:38 DP24 TP24] Scheduler hit an exception: Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/sglang/srt/managers/scheduler.py", line 1999, in run_scheduler_process
scheduler = Scheduler(server_args, port_args, gpu_id, tp_rank, dp_rank)
File "/usr/local/lib/python3.10/dist-packages/sglang/srt/managers/scheduler.py", line 249, in __init__
self.tp_worker = TpWorkerClass(
File "/usr/local/lib/python3.10/dist-packages/sglang/srt/managers/tp_worker_overlap_thread.py", line 63, in __init__
self.worker = TpModelWorker(server_args, gpu_id, tp_rank, dp_rank, nccl_port)
File "/usr/local/lib/python3.10/dist-packages/sglang/srt/managers/tp_worker.py", line 74, in __init__
self.model_runner = ModelRunner(
File "/usr/local/lib/python3.10/dist-packages/sglang/srt/model_executor/model_runner.py", line 178, in __init__
self.initialize(min_per_gpu_memory)
File "/usr/local/lib/python3.10/dist-packages/sglang/srt/model_executor/model_runner.py", line 188, in initialize
self.load_model()
File "/usr/local/lib/python3.10/dist-packages/sglang/srt/model_executor/model_runner.py", line 400, in load_model
self.model = get_model(
File "/usr/local/lib/python3.10/dist-packages/sglang/srt/model_loader/__init__.py", line 22, in get_model
return loader.load_model(
File "/usr/local/lib/python3.10/dist-packages/sglang/srt/model_loader/loader.py", line 365, in load_model
model = _initialize_model(
File "/usr/local/lib/python3.10/dist-packages/sglang/srt/model_loader/loader.py", line 146, in _initialize_model
return model_class(
File "/usr/local/lib/python3.10/dist-packages/sglang/srt/models/deepseek_v2.py", line 1352, in __init__
self.model = DeepseekV2Model(
File "/usr/local/lib/python3.10/dist-packages/sglang/srt/models/deepseek_v2.py", line 1278, in __init__
[
File "/usr/local/lib/python3.10/dist-packages/sglang/srt/models/deepseek_v2.py", line 1279, in <listcomp>
DeepseekV2DecoderLayer(
File "/usr/local/lib/python3.10/dist-packages/sglang/srt/models/deepseek_v2.py", line 1084, in __init__
self.mlp = DeepseekV2MLP(
File "/usr/local/lib/python3.10/dist-packages/sglang/srt/models/deepseek_v2.py", line 108, in __init__
self.gate_up_proj = MergedColumnParallelLinear(
File "/usr/local/lib/python3.10/dist-packages/sglang/srt/layers/linear.py", line 506, in __init__
super().__init__(
File "/usr/local/lib/python3.10/dist-packages/sglang/srt/layers/linear.py", line 361, in __init__
self.quant_method.create_weights(
File "/usr/local/lib/python3.10/dist-packages/sglang/srt/layers/quantization/fp8.py", line 238, in create_weights
raise ValueError(
ValueError: Weight output_partition_size = 288 is not divisible by weight quantization block_n = 128.
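
The arithmetic behind this error can be reproduced outside sglang. The snippet below is a minimal standalone sketch, not sglang's actual code: `check_block_alignment` and `aligned_tp_sizes` are hypothetical helper names, and the total output size of 18432 is an assumption chosen only for illustration (the report does not state the model's layer size or the tensor-parallel size in use). Only the values 288 and 128 come from the error message.

```python
from typing import List


def check_block_alignment(output_partition_size: int, block_n: int) -> int:
    """Return the remainder of the partition size modulo the block size.

    A result of 0 means the per-rank partition is a whole number of
    quantization blocks; any other value corresponds to the ValueError above.
    """
    return output_partition_size % block_n


def aligned_tp_sizes(total_output_size: int, block_n: int,
                     candidates: List[int]) -> List[int]:
    """Return the candidate tensor-parallel sizes whose per-rank partition
    is both an integer and a multiple of block_n."""
    ok = []
    for tp in candidates:
        if total_output_size % tp != 0:
            continue  # the layer cannot be split evenly across tp ranks
        if (total_output_size // tp) % block_n == 0:
            ok.append(tp)
    return ok


# Values taken from the error message.
print(check_block_alignment(288, 128))  # 288 = 2 * 128 + 32, so remainder 32

# ASSUMPTION: 18432 is a hypothetical total output size, not from the report.
HYPOTHETICAL_TOTAL = 18432
print(aligned_tp_sizes(HYPOTHETICAL_TOTAL, 128, [8, 16, 32, 64]))
```

Under that assumed total, only the smaller tensor-parallel sizes in the candidate list leave each rank with a multiple of 128 columns, which is the condition the check in `fp8.py` appears to enforce.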
Related resources
No response