### Description
### Checklist
1. I have searched related issues but cannot get the expected help.
2. The bug has not been fixed in the latest version.
3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.
4. If the issue you raised is not a bug but a question, please raise a discussion at https://github.com/sgl-project/sglang/discussions/new/choose. Otherwise, it will be closed.
5. Please use English, otherwise it will be closed.
### Describe the bug
I deployed DeepSeek-V3-0324 on a single H20-3e 141G node with the sglang Docker image v0.4.5-cu124. Running `sglang.bench_serving`, I found that input token throughput decreased dramatically compared with Docker image v0.4.3:
| Version | num_prompts | Input token throughput |
| --- | --- | --- |
| sglang v0.4.3 | 16 | 652 TPS |
| sglang v0.4.5 | 16 | 198 TPS |
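To put the two measurements above in perspective, the relative regression can be computed directly from the reported numbers; this is a minimal sketch (the TPS values are the ones measured here, and `throughput_drop` is just an illustrative helper, not part of sglang):

```python
def throughput_drop(old_tps: float, new_tps: float) -> float:
    """Return the relative throughput drop as a percentage."""
    return (old_tps - new_tps) / old_tps * 100

# Input token throughput reported in this issue: 652 TPS (v0.4.3) vs 198 TPS (v0.4.5).
drop = throughput_drop(652.0, 198.0)
print(f"Input token throughput dropped by {drop:.1f}%")  # roughly 70%
```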
### Reproduction
The startup command is identical for both versions:
```shell
docker run -itd --gpus all --shm-size 500g -p 8000:8000 \
  -v /data/0324:/data/deepseek-v3 --ipc=host --network=host --privileged=true \
  lmsysorg/sglang:v0.4.5-cu124 \
  python3 -m sglang.launch_server --model /data/deepseek-v3 \
  --served-model-name deepseek-v3 --mem-fraction-static 0.95 --tp 8 \
  --host 0.0.0.0 --port 8000 --max-total-tokens 65536 --trust-remote-code \
  --enable-flashinfer-mla --enable-dp-attention --dp 2
```
Benchmark command (the same for v0.4.3 and v0.4.5):
```shell
python3 -m sglang.bench_serving --backend sglang --host 0.0.0.0 --port 8000 \
  --model deepseek-v3 --dataset-name random --num-prompts 16 \
  --random-input 4096 --random-output 1024 --random-range-ratio 0.5 \
  --dataset-path /data/ShareGPT_V3_unfiltered_cleaned_split.json
```
### Environment
GPU: 8× H20-3e 141G