You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
[sglang] feat: Add SGLang async multi-turn rollout with tool support (volcengine#1037)
A redesigned version of volcengine#917
## Current Status
[Develop log &
Tracker](zhaochenyang20/Awesome-ML-SYS-Tutorial#113)
**What Has Been Done**
- Async Rollout Refactoring: Integrate with the tool server to
coordinate tool calls during generation, leveraging request IDs for
state and progress tracking, support async multi-turn conversations in
Agentic RL training (with Tool support).
- Async Request Management: Encapsulate rollout requests into a unified
structure, enabling efficient tracking and handling of concurrent
multi-turn dialogues with chatml style messages.
- Extensible Tools: A modular design for adapt tools in
OpenAIFunctionTool format which is both support by SGLang and vLLM, with
create separate instance, execute when tool call, calc score according
to tool env state and release resource.
- Multi-turn support has been implemented for the GSM8K task (new
version working on). However, training has not yet converged, and we
hope the community could join to investigate the issue.
**What Is WIP**
- [x] Merge loss mask to training process from last version
- [x] Add more user friendly tool config and e2e tests for gsm8k with
tool training
- [ ] We are going to validate our multiturn feature in open-source
sandbox environments.
## Key Features will be introduced in future version
- Integrate a Ray-based agent trainer to enable explicit separation of
the rollout and training pipeline. Provide support for partial rollout
handling and fine-grained request state management.
- Extend the framework to support simulated user interactions (e.g.,
roleplay, interactive feedback) and more complex environment-in-the-loop
RL tasks.
**Future Plan**
[Discussion
Thread](zhaochenyang20/Awesome-ML-SYS-Tutorial#74 (comment))
[RFC
doc](https://github.com/SwordFaith/verl-sglang-dev-log/blob/main/rlhf/verl/multi-turn/veRL-multiturn-rollout-RFC.md)
will be updated soon.
## Contributors & Acknowledgement
- Xiang Long [[email protected]](mailto:[email protected])
@SwordFaith (Design RFC & core-dev of refactor part)
- Yuzhen Zhou [[email protected]](mailto:[email protected])
@zyzshishui (Core-dev)
- Chenyang Zhao [[email protected]](mailto:[email protected])
@zhaochenyang20 (PM)
- Guanhua Wang @WANG-GH
- Junrong Lin @ocss884 (verl-sglang support)
- Hanchen Zhang
[[email protected]](mailto:[email protected])
- Haoran Wang [[email protected]](mailto:[email protected])
- Rui Lu [[email protected]](mailto:[email protected])
- Yujiang Li [[email protected]](mailto:[email protected])
- Jiajun Li [[email protected]](mailto:[email protected])
- Jin Pan [[email protected]](mailto:[email protected])
- Zhi Zheng [[email protected]](mailto:[email protected])
@zh-zheng
---------
Co-authored-by: zyzshishui <[email protected]>
Co-authored-by: guanhua <[email protected]>
Co-authored-by: zhaochenyang20 <[email protected]>
Co-authored-by: ocss884 <[email protected]>
Co-authored-by: Shawn/Yuxuan Tong <[email protected]>
Co-authored-by: HL <[email protected]>
0 commit comments