[RL] Remove the w13 weight_scale and input_scale for UnquantizedEPMoE… #6308

zhuzilin · 2025-05-15T03:03:06Z

…Method

Motivation

When doing RL training, we may release all the parameters with /release_memory_occupation to free the memory occupied by the inference engine, which also releases all the input_scales and weight_scales.

Modifications

The original w13_weight_scale in UnquantizedEPMoEMethod does not support reloading (as the shape should be (num_experts_per_partition, 2)). I also found that UnquantizedEPMoEMethod does not need to instantiate w13_weight_scale and w13_input_scale at all, so removing them is a better solution than allocating twice the original memory.
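The idea of leaving the w13 scales uninstantiated can be illustrated with a toy model. This is only a sketch in plain Python, not the sglang code: `ToyExpertLayer`, `create_unquantized_scales`, and `reloadable_params` are hypothetical names, and keeping `w2_input_scale` as a list of ones is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToyExpertLayer:
    """Hypothetical stand-in for an expert-parallel MoE layer (not the sglang class)."""

    num_experts_per_partition: int
    w13_weight_scale: Optional[List[float]] = None
    w13_input_scale: Optional[List[float]] = None
    w2_input_scale: Optional[List[float]] = None


def create_unquantized_scales(layer: ToyExpertLayer) -> ToyExpertLayer:
    # Assumption: the unquantized path never reads the w13 scales,
    # so they are left as None instead of being allocated.
    layer.w13_weight_scale = None
    layer.w13_input_scale = None
    # Assumption: w2_input_scale is kept, one entry per local expert.
    layer.w2_input_scale = [1.0] * layer.num_experts_per_partition
    return layer


def reloadable_params(layer: ToyExpertLayer) -> List[str]:
    # Only parameters that actually exist need to be restored after a release.
    names = ("w13_weight_scale", "w13_input_scale", "w2_input_scale")
    return [name for name in names if getattr(layer, name) is not None]
```

With this layout, a release-and-reload cycle has nothing to restore for the w13 scales, so no reload path with a matching shape is needed for them.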

Note that we do still need to reload w2_input_scale, because if we set it to None, it will be initialized to torch.ones during EpMoE.forward_normal. So I changed the condition in _load_fp8_scale to allow loading w2_input_scale from a random value to 1.
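A relaxed loading condition of this kind might look roughly like the sketch below. It is written in plain Python with floats rather than tensors, and it is not the actual `_load_fp8_scale` implementation: the function name `load_input_scale`, its parameters, and the tolerance are all assumptions for illustration.

```python
from typing import List


def load_input_scale(
    scales: List[float],
    expert_id: int,
    shard_id: str,
    loaded: float,
    tol: float = 1e-5,
) -> None:
    """Hypothetical sketch of an input-scale loader with a relaxed w2 condition."""
    current = scales[expert_id]
    if shard_id in ("w1", "w3"):
        # w1 and w3 share one input scale. A stored value of 1 is treated as
        # "not loaded yet"; any other value must agree with the new one.
        if current != 1.0 and abs(current - loaded) > tol:
            raise ValueError(
                f"input scales of w1 and w3 must be equal, got {current} vs. {loaded}"
            )
    # For w2 there is no consistency check: after the memory has been released,
    # the buffer may hold an arbitrary value, so it is simply overwritten.
    scales[expert_id] = loaded
```

For example, a w2 buffer that holds a leftover value such as 0.37 after a release can be reset to 1 by calling `load_input_scale(buf, 0, "w2", 1.0)`, while conflicting w1 and w3 scales still raise an error.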

Thank you for taking the time to review this PR :)

Checklist

Format your code according to the Code Formatting with Pre-Commit.
Add unit tests as outlined in the Running Unit Tests.
Update documentation / docstrings / example tutorials as needed, according to Writing Documentation.
Provide throughput / latency benchmark results and accuracy evaluation results as needed, according to Benchmark and Profiling and Accuracy Results.
For reviewers: If you haven't made any contributions to this PR and are only assisting with merging the main branch, please remove yourself as a co-author when merging the PR.
Please feel free to join our Slack channel at https://slack.sglang.ai to discuss your PR.

…Method

fzyzcjy

LGTM. Indeed, making it None has one extra benefit: when doing EPLB shuffling, we no longer need to send these params between ranks.

sgl-project#6308)

[RL] Remove the w13 weight_scale and input_scale for UnquantizedEPMoE…

b6919af

…Method

zhuzilin requested review from merrymercy, Ying1123, zhyncs, ispobock, HaiShaw and ch-wan as code owners May 15, 2025 03:03

Merge branch 'main' into feature/fix_deepseek_load

ca70fd2

zhaochenyang20 requested a review from BBuf as a code owner May 20, 2025 02:17

zhyncs and others added 3 commits May 19, 2025 23:09

Merge branch 'main' into feature/fix_deepseek_load

3a6ce0f

Merge branch 'main' into feature/fix_deepseek_load

22452b9

Merge branch 'main' into feature/fix_deepseek_load

b873416

zhyncs assigned fzyzcjy, ispobock and sleepcoo May 21, 2025

Merge branch 'main' into feature/fix_deepseek_load

35683af

fzyzcjy approved these changes May 21, 2025

View reviewed changes

sleepcoo approved these changes May 22, 2025

View reviewed changes

sleepcoo and others added 2 commits May 21, 2025 21:10

Merge branch 'main' into feature/fix_deepseek_load

33e6e12

Merge branch 'main' into feature/fix_deepseek_load

02b0966

zhyncs merged commit e9feb48 into sgl-project:main May 22, 2025
0 of 21 checks passed

Layssy pushed a commit to Layssy/sglang-iaas that referenced this pull request Jun 9, 2025

[RL] Remove the w13 weight_scale and input_scale for UnquantizedEPMoE… (

c8f7f52

sgl-project#6308)

xwu-intel pushed a commit to xwu-intel/sglang that referenced this pull request Jun 17, 2025

[RL] Remove the w13 weight_scale and input_scale for UnquantizedEPMoE… (

6b7839c

sgl-project#6308)

zhuzilin mentioned this pull request Jun 21, 2025

[sglang] Tracking sglang compatibility in slime THUDM/slime#6

Open

15 tasks
