Improve dp attention port assignment scheme #5889
Open
jokerwyt wants to merge 23 commits into sgl-project:main from jokerwyt:dp-port-dispatch
Commits (23)
f96d599  feat: dynamic DP controller port dispatch  [jokerwyt]
bfecdbb  Merge remote-tracking branch 'gh/main' into dp-port-dispatch  [jokerwyt]
60f8a55  Fix completions endpoint bootstrap port passing  [jokerwyt]
3e5d6ed  [WIP] dynamic DP port  [jokerwyt]
68fdf09  Dynamic DP port assignment  [jokerwyt]
2865267  Better dynamic port, lower conflict  [jokerwyt]
6f9eea5  small fix  [jokerwyt]
c96c1b0  NIXL DP support (#5681)  [jokerwyt]
e221b28  Remove some debug print  [jokerwyt]
6468136  Merge branch 'main' of https://github.com/sgl-project/sglang into dp-…  [jokerwyt]
8636070  Merge branch 'main' of https://github.com/sgl-project/sglang into dp-…  [jokerwyt]
0fa37d3  Merge branch 'main' into dp-port-dispatch  [jokerwyt]
ad828c1  Atomic assignment of dp attention scheduler ports  [jokerwyt]
ac7662b  Merge branch 'main' of https://github.com/sgl-project/sglang into dp-…  [jokerwyt]
f5930a0  Merge branch 'dp-port-dispatch' of github.com:jokerwyt/sglang-public …  [jokerwyt]
dc379a6  Refine  [jokerwyt]
b21222a  Merge branch 'main' into dp-port-dispatch  [jokerwyt]
0872328  Merge branch 'main' into dp-port-dispatch  [ispobock]
ee343ac  Merge branch 'main' into dp-port-dispatch  [zhyncs]
b23a42f  Merge branch 'main' of https://github.com/sgl-project/sglang into dp-…  [jokerwyt]
50fc888  Add cmd args and test  [jokerwyt]
96ed813  Merge branch 'dp-port-dispatch' of github.com:jokerwyt/sglang-public …  [jokerwyt]
2a7f2d0  Merge branch 'main' into dp-port-dispatch  [jokerwyt]
@@ -0,0 +1,70 @@
import unittest
from types import SimpleNamespace

from sglang.srt.utils import kill_process_tree
from sglang.test.run_eval import run_eval
from sglang.test.test_utils import (
    DEFAULT_MLA_MODEL_NAME_FOR_TEST,
    DEFAULT_TIMEOUT_FOR_SERVER_LAUNCH,
    DEFAULT_URL_FOR_TEST,
    CustomTestCase,
    popen_launch_server,
)


class TestDPAttentionDP2TP2PortPicking(CustomTestCase):
    @classmethod
    def setUpClass(cls):
        cls.model = DEFAULT_MLA_MODEL_NAME_FOR_TEST
        cls.base_url = DEFAULT_URL_FOR_TEST
        cls.process = popen_launch_server(
            cls.model,
            cls.base_url,
            timeout=DEFAULT_TIMEOUT_FOR_SERVER_LAUNCH,
            other_args=[
                "--trust-remote-code",
                "--tp",
                "2",
                "--enable-dp-attention",
                "--dp",
                "2",
                "--enable-torch-compile",
                "--torch-compile-max-bs",
                "2",
                "--pick-free-dp-port",
            ],
        )

    @classmethod
    def tearDownClass(cls):
        kill_process_tree(cls.process.pid)

    def test_mmlu(self):
        args = SimpleNamespace(
            base_url=self.base_url,
            model=self.model,
            eval_name="mmlu",
            num_examples=64,
            num_threads=32,
        )

        metrics = run_eval(args)
        print(f"{metrics=}")
        self.assertGreater(metrics["score"], 0.5)

    def test_mgsm_en(self):
        args = SimpleNamespace(
            base_url=self.base_url,
            model=self.model,
            eval_name="mgsm_en",
            num_examples=None,
            num_threads=1024,
        )

        metrics = run_eval(args)
        print(f"{metrics=}")
        self.assertGreater(metrics["score"], 0.8)


if __name__ == "__main__":
    unittest.main()
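
Note for readers: the diff shown on this page contains only the test, not the code behind `--pick-free-dp-port`. As a rough illustration of the general idea that the flag name and the "Atomic assignment of dp attention scheduler ports" commit title suggest, the sketch below asks the OS for several free TCP ports at once by binding to port 0. The helper name `pick_free_ports` and its parameters are hypothetical and are not taken from sglang; the PR's actual scheme may differ.

```python
import socket
from contextlib import ExitStack


def pick_free_ports(count, host="127.0.0.1"):
    """Hypothetical helper (not sglang's API): return `count` distinct
    TCP ports that were free at the time of probing."""
    with ExitStack() as stack:
        ports = []
        for _ in range(count):
            sock = stack.enter_context(
                socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            )
            # Binding to port 0 lets the OS choose an unused port.
            sock.bind((host, 0))
            ports.append(sock.getsockname()[1])
        # All probe sockets stay open until this point, so the OS
        # cannot hand out the same port twice within one call.
        return ports


if __name__ == "__main__":
    # One port per DP rank, matching "--dp 2" in the test above.
    print(pick_free_ports(2))
```

The ports are only known to be free while the probe sockets are open. Another process could claim one between this call returning and the scheduler binding it, so a real implementation would likely need a retry or a handoff of the bound socket.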