[Feature] How to deploy deepseek ep_size=64 on h100? #5405

Closed
@CSEEduanyu

Description

Motivation

[2025-04-15 06:23:38 DP24 TP24] Scheduler hit an exception: Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/managers/scheduler.py", line 1999, in run_scheduler_process
    scheduler = Scheduler(server_args, port_args, gpu_id, tp_rank, dp_rank)
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/managers/scheduler.py", line 249, in __init__
    self.tp_worker = TpWorkerClass(
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/managers/tp_worker_overlap_thread.py", line 63, in __init__
    self.worker = TpModelWorker(server_args, gpu_id, tp_rank, dp_rank, nccl_port)
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/managers/tp_worker.py", line 74, in __init__
    self.model_runner = ModelRunner(
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/model_executor/model_runner.py", line 178, in __init__
    self.initialize(min_per_gpu_memory)
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/model_executor/model_runner.py", line 188, in initialize
    self.load_model()
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/model_executor/model_runner.py", line 400, in load_model
    self.model = get_model(
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/model_loader/__init__.py", line 22, in get_model
    return loader.load_model(
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/model_loader/loader.py", line 365, in load_model
    model = _initialize_model(
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/model_loader/loader.py", line 146, in _initialize_model
    return model_class(
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/models/deepseek_v2.py", line 1352, in __init__
    self.model = DeepseekV2Model(
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/models/deepseek_v2.py", line 1278, in __init__
    [
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/models/deepseek_v2.py", line 1279, in <listcomp>
    DeepseekV2DecoderLayer(
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/models/deepseek_v2.py", line 1084, in __init__
    self.mlp = DeepseekV2MLP(
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/models/deepseek_v2.py", line 108, in __init__
    self.gate_up_proj = MergedColumnParallelLinear(
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/layers/linear.py", line 506, in __init__
    super().__init__(
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/layers/linear.py", line 361, in __init__
    self.quant_method.create_weights(
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/layers/quantization/fp8.py", line 238, in create_weights
    raise ValueError(
ValueError: Weight output_partition_size = 288 is not divisible by weight quantization block_n = 128.
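
The arithmetic behind the error can be reproduced with a short sketch. It rests on two assumptions that are not stated in the log: the dense MLP `intermediate_size` is 18432 (the value in the published DeepSeek-V3 config), and the tensor-parallel size is 64 (inferred from the title). The helper names below are hypothetical and are not part of SGLang. Under those assumptions, each rank gets 18432 / 64 = 288 output columns, which is not a multiple of the FP8 block size 128.

```python
# Illustrative only: these helpers are hypothetical, not SGLang APIs.
# ASSUMPTION: intermediate_size = 18432 (DeepSeek-V3 dense MLP) and tp_size = 64.

BLOCK_N = 128              # block_n reported in the error message
INTERMEDIATE_SIZE = 18432  # assumed; not present in the log itself


def partition_size(output_size: int, tp_size: int) -> int:
    """Columns each tensor-parallel rank holds for a column-parallel weight."""
    if output_size % tp_size != 0:
        raise ValueError(f"{output_size} is not divisible by tp_size={tp_size}")
    return output_size // tp_size


def is_block_aligned(output_size: int, tp_size: int, block_n: int = BLOCK_N) -> bool:
    """True if the per-rank partition is a whole number of quantization blocks."""
    if output_size % tp_size != 0:
        return False
    return (output_size // tp_size) % block_n == 0


def compatible_tp_sizes(output_size: int, block_n: int = BLOCK_N, max_tp: int = 64) -> list[int]:
    """All tp sizes up to max_tp whose partitions are block-aligned."""
    return [tp for tp in range(1, max_tp + 1) if is_block_aligned(output_size, tp, block_n)]


if __name__ == "__main__":
    print(partition_size(INTERMEDIATE_SIZE, 64))    # 288, matching the error message
    print(is_block_aligned(INTERMEDIATE_SIZE, 64))  # False
    print(compatible_tp_sizes(INTERMEDIATE_SIZE))
```

This checks only the divisibility constraint for this one layer. Whether a given parallel layout is otherwise supported by SGLang is a separate question that the sketch does not answer.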
