Improve dp attention port assignment scheme #5889

Open: wants to merge 23 commits into main (base branch)

jokerwyt
Contributor

@jokerwyt jokerwyt commented Apr 29, 2025

Motivation

When DP attention is enabled on many GPUs (for example, 64), the number of ports needed on node 0 equals the DP size. The port space is often shared with other processes (for example, in a container using host networking, or on bare metal), so the probability of a port conflict is quite high.

Modifications

We obtain free ports on node 0 and broadcast them to the other nodes using dist_init_addr, before the DP controller assigns the zmq port to the scheduler with attn_tp_rank=0. We also move the port binding right next to get_free_port, which shrinks the window in which another process can take the port and so reduces the chance of a conflict.
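
To illustrate why binding right after finding a free port helps, here is a minimal sketch using only the Python standard library. It is not the code in this PR: `reserve_free_ports` and `release` are hypothetical helper names, and the real change works with zmq sockets and `get_free_port`. The sketch asks the OS for a free port by binding to port 0 and keeps the socket open, so the port cannot be handed to another process between discovery and use.

```python
import socket
from typing import List, Tuple


def reserve_free_ports(
    n: int, host: str = "127.0.0.1"
) -> List[Tuple[int, socket.socket]]:
    """Reserve n distinct ports (hypothetical helper, not part of sglang).

    Each port is obtained by binding to port 0, which lets the OS pick a
    free port. The bound socket is returned together with the port number
    so the caller holds the port until it is ready to use it.
    """
    reserved: List[Tuple[int, socket.socket]] = []
    try:
        for _ in range(n):
            sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            sock.bind((host, 0))  # port 0: the OS chooses any free port
            reserved.append((sock.getsockname()[1], sock))
    except OSError:
        # Release everything reserved so far before re-raising.
        for _, sock in reserved:
            sock.close()
        raise
    return reserved


def release(reserved: List[Tuple[int, socket.socket]]) -> None:
    """Close the placeholder sockets, freeing the ports."""
    for _, sock in reserved:
        sock.close()


if __name__ == "__main__":
    held = reserve_free_ports(4)
    print("reserved ports:", [port for port, _ in held])
    release(held)
```

Because every placeholder socket stays bound until `release` is called, the OS cannot return the same port twice within one call. A "find a free port, close it, bind later" approach leaves a gap in which another process on a shared host can claim the port.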

Test

More testing on different settings is welcome, especially settings without PD disaggregation.

2025-04-29 06:19:28,796 - pdutils - INFO - runCommand remotely: ssh -o StrictHostKeyChecking=no  ytw0 "PS1=[] source ~/.bashrc  && ( UCX_TLS=rc,gdr_copy,rc_x,cuda_copy,cuda_ipc UCX_NET_DEVICES=mlx5_bond_1:1,mlx5_bond_2:1,mlx5_bond_3:1,mlx5_bond_4:1,mlx5_bond_5:1,mlx5_bond_6:1,mlx5_bond_7:1,mlx5_bond_8:1 UCX_LOG_LEVEL=info NCCL_DEBUG=WARN SGLANG_PD_NIXL_DEBUG_TRANSFER_TIME=1 SGL_ENABLE_JIT_DEEPGEMM=0 python3.10 -m sglang.launch_server --host 0.0.0.0 --nnodes 2 --node-rank 0 --dist-init-addr ytw0:44725 --tp 16 --model-path /mnt/gemininjceph2/geminicephfs/mm-base-plt2/opensource_model/DeepSeek-R1_with_draft/DeepSeek-R1 --trust-remote-code --disable-radix-cache --schedule-policy fcfs --mem-fraction-static 0.70 --disable-overlap-schedule --chunked-prefill-size 32768  --log-level debug --enable-metrics --page-size 64 --disaggregation-mode prefill --disaggregation-transfer-backend nixl --disaggregation-bootstrap-port 5813 --max-running-requests 32 --port 40081 )"
2025-04-29 06:19:28,797 - pdutils - INFO - runCommand remotely: ssh -o StrictHostKeyChecking=no  ytw1 "PS1=[] source ~/.bashrc  && ( UCX_TLS=rc,gdr_copy,rc_x,cuda_copy,cuda_ipc UCX_NET_DEVICES=mlx5_bond_1:1,mlx5_bond_2:1,mlx5_bond_3:1,mlx5_bond_4:1,mlx5_bond_5:1,mlx5_bond_6:1,mlx5_bond_7:1,mlx5_bond_8:1 UCX_LOG_LEVEL=info NCCL_DEBUG=WARN SGLANG_PD_NIXL_DEBUG_TRANSFER_TIME=1 SGL_ENABLE_JIT_DEEPGEMM=0 python3.10 -m sglang.launch_server --nnodes 2 --node-rank 1 --dist-init-addr ytw0:44725 --tp 16 --model-path /mnt/gemininjceph2/geminicephfs/mm-base-plt2/opensource_model/DeepSeek-R1_with_draft/DeepSeek-R1 --trust-remote-code --disable-radix-cache --schedule-policy fcfs --mem-fraction-static 0.70 --disable-overlap-schedule --chunked-prefill-size 32768  --log-level debug --enable-metrics --page-size 64 --disaggregation-mode prefill --disaggregation-transfer-backend nixl --disaggregation-bootstrap-port 5813 --max-running-requests 32 --port 8417 )"
2025-04-29 06:19:28,797 - pdutils - INFO - runCommand remotely: ssh -o StrictHostKeyChecking=no  ytw2 "PS1=[] source ~/.bashrc  && ( UCX_TLS=rc,gdr_copy,rc_x,cuda_copy,cuda_ipc UCX_NET_DEVICES=mlx5_bond_1:1,mlx5_bond_2:1,mlx5_bond_3:1,mlx5_bond_4:1,mlx5_bond_5:1,mlx5_bond_6:1,mlx5_bond_7:1,mlx5_bond_8:1 UCX_LOG_LEVEL=info NCCL_DEBUG=WARN SGLANG_PD_NIXL_DEBUG_TRANSFER_TIME=1 SGL_ENABLE_JIT_DEEPGEMM=0 python3.10 -m sglang.launch_server --host 0.0.0.0 --nnodes 2 --node-rank 0 --dist-init-addr ytw2:16187 --enable-dp-attention --dp-size 16 --tp 16 --model-path /mnt/gemininjceph2/geminicephfs/mm-base-plt2/opensource_model/DeepSeek-R1_with_draft/DeepSeek-R1 --trust-remote-code --disable-radix-cache --schedule-policy fcfs --mem-fraction-static 0.70 --disable-overlap-schedule --chunked-prefill-size 32768  --log-level debug --enable-metrics --page-size 64 --disaggregation-mode decode --disaggregation-transfer-backend nixl --max-running-requests 32 --port 63339 )"
2025-04-29 06:19:28,797 - pdutils - INFO - runCommand remotely: ssh -o StrictHostKeyChecking=no  ytw3 "PS1=[] source ~/.bashrc  && ( UCX_TLS=rc,gdr_copy,rc_x,cuda_copy,cuda_ipc UCX_NET_DEVICES=mlx5_bond_1:1,mlx5_bond_2:1,mlx5_bond_3:1,mlx5_bond_4:1,mlx5_bond_5:1,mlx5_bond_6:1,mlx5_bond_7:1,mlx5_bond_8:1 UCX_LOG_LEVEL=info NCCL_DEBUG=WARN SGLANG_PD_NIXL_DEBUG_TRANSFER_TIME=1 SGL_ENABLE_JIT_DEEPGEMM=0 python3.10 -m sglang.launch_server --nnodes 2 --node-rank 1 --dist-init-addr ytw2:16187 --enable-dp-attention --dp-size 16 --tp 16 --model-path /mnt/gemininjceph2/geminicephfs/mm-base-plt2/opensource_model/DeepSeek-R1_with_draft/DeepSeek-R1 --trust-remote-code --disable-radix-cache --schedule-policy fcfs --mem-fraction-static 0.70 --disable-overlap-schedule --chunked-prefill-size 32768  --log-level debug --enable-metrics --page-size 64 --disaggregation-mode decode --disaggregation-transfer-backend nixl --max-running-requests 32 --port 23093 )"
2025-04-29 06:19:28,797 - __main__ - INFO - waiting for instance with log path /tmp/sgl-prefill-0-0.log to be ready...
2025-04-29 06:19:28,798 - pdutils - INFO - wait_server: ytw0:40081
2025-04-29 06:20:46,880 - __main__ - INFO - waiting for instance with log path /tmp/sgl-decode-0-0.log to be ready...
2025-04-29 06:20:46,880 - pdutils - INFO - wait_server: ytw2:63339
2025-04-29 06:21:13,911 - __main__ - INFO - All instances are ready! Wait some seconds to let the server warm up.
2025-04-29 06:21:23,921 - pdutils - INFO - runCommand remotely: ssh -o StrictHostKeyChecking=no  ytw0 "PS1=[] source ~/.bashrc  && ( python3.10 -m sglang.srt.disaggregation.mini_lb --prefill http://ytw0:40081 --decode http://ytw2:63339 --host 0.0.0.0 --port 11441 --prefill-bootstrap-ports 5813 )"

Checklist

@Qinyu-Xu

Is there any progress on this PR? @merrymercy @zhyncs @ByronHsu @jokerwyt

@jokerwyt
Contributor Author

@Qinyu-Xu I have just resolved the conflicts with the latest main. #6258 is blocking me from testing. Once that issue is resolved, I think we can test and merge this PR. You are welcome to try this PR in your use case and share your experience.

@jokerwyt
Contributor Author

Tested okay. Ready for review and merge.

@jokerwyt
Contributor Author

Can we merge this? It's a little bit time-consuming...

@fzyzcjy @ch-wan

@fzyzcjy
Collaborator

fzyzcjy commented May 29, 2025

The general idea LGTM, but I have no time to review the details now :( If you can find someone to review it, it can usually be merged.

@ShangmingCai ShangmingCai requested review from ch-wan and removed request for HaiShaw May 29, 2025 02:45
@ch-wan ch-wan assigned ch-wan, ispobock and merrymercy and unassigned ch-wan May 29, 2025
@ch-wan
Collaborator

ch-wan commented May 29, 2025

@jokerwyt This part was implemented by @merrymercy and @ispobock. I have added them to the review list.

@jokerwyt
Contributor Author

jokerwyt commented Jun 7, 2025

@merrymercy @ispobock @zhyncs
Added a command-line argument and a test. Waiting for approval to run CI/CD, and then for the merge.
