
Conversation

@wuxibin89
Collaborator

Checklist Before Starting

  • Search for similar PR(s).

What does this PR do?

ChatCompletionScheduler interacts with the async rollout server through an OpenAI client and follows OpenAI's tool-calling schema. So any inference framework implementing an OpenAI-compatible server (e.g., vLLM, SGLang) should work with ChatCompletionScheduler.
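
For illustration, here is a minimal sketch (not code from this PR) of the interaction pattern the scheduler relies on: a plain AsyncOpenAI client pointed at any OpenAI-compatible endpoint, with tool calls returned in OpenAI's standard schema. The base_url, model name, and get_weather tool are placeholders, not values from verl.

from openai import AsyncOpenAI

# Placeholder endpoint; any vLLM/SGLang OpenAI-compatible server is addressed the same way.
client = AsyncOpenAI(base_url="http://localhost:8000/v1", api_key="token-abc123")

# Hypothetical tool definition following OpenAI's tool-calling schema.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "parameters": {"type": "object", "properties": {"city": {"type": "string"}}},
    },
}]

async def rollout_step(model: str, messages: list) -> dict:
    completion = await client.chat.completions.create(model=model, messages=messages, tools=tools)
    # The returned message may carry tool_calls in the standard OpenAI format.
    return completion.choices[0].message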

High-Level Design

Demonstrate the high-level design if this PR is complex.

Specific Changes

List the specific changes.

API

Demonstrate how the API changes if any.

Usage Example

Provide usage example(s) for easier usage.

# Add code snippet or script demonstrating how to use this 

Test

For changes that cannot be tested by CI (e.g., algorithm implementation, new model support), validate by experiment(s) and show results like training curve plots, evaluation results, etc.

Additional Info.

  • Issue Number: Fixes issue # or discussion # if any.
  • Training: [Note which backend this PR will affect: FSDP, Megatron, both, or none]
  • Inference: [Note which backend this PR will affect: vLLM, SGLang, both, or none]

Checklist Before Submitting

  • Read the Contribute Guide.
  • Apply pre-commit checks.
  • Add [BREAKING] to the PR title if it breaks any API.
  • Update the documentation about your changes in the docs.
  • Add CI test(s) if necessary.

@casper-hansen
Contributor

casper-hansen commented Jun 3, 2025

All in all, I think it's a great PR that abstracts part of the chat scheduler away; however, I would recommend keeping the chat scheduler extensible instead of removing that extensibility.

Here is why:

  1. I compute a specific amount of completion tokens left and have to set that in the sampling-parameter kwargs.
  2. I had a case where I needed to override _postprocess to set enable_thinking for the Qwen3 models to work properly (I was getting empty <think></think> in a multi-turn scenario).
  3. I also use self.tokenizer.apply_chat_template extensively to stay within the context length when tools are called many times and return a lot of context (see the sketch after this list). Removing this would reduce usability.
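
A hedged sketch of items 1–3 in practice (the tokenizer name, max_model_len, and helper name are assumptions for illustration, not values from verl or this PR):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")  # assumed model
max_model_len = 32768  # assumed context window

def remaining_completion_tokens(messages) -> int:
    # Tokenize with the chat template to measure the prompt exactly as the server sees it.
    prompt_ids = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=True,
        enable_thinking=True,  # item 2: Qwen3 chat templates accept this flag
    )
    return max_model_len - len(prompt_ids)

# Item 1: pass the remaining budget through the sampling kwargs, e.g.
# extra_sampling_params = {"max_tokens": remaining_completion_tokens(messages)}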

Btw, tools in training are also kind of hard to debug. Is there anything that can make this easier?

@wuxibin89
Collaborator Author

@casper-hansen Thanks for your advice. I've added a configurable option completion_callback in verl/trainer/config/ppo_trainer.yaml, which allows users to specify their own custom CompletionCallback implementation.

Users should inherit CompletionCallback and implement two methods:

from typing import Any, Dict, List
from openai.types.chat import ChatCompletion
from verl import DataProto  # CompletionCallback comes from verl's chat scheduler module

class CustomCompletionCallback(CompletionCallback):
    async def __call__(self, messages: List[Dict[str, str]], completions: ChatCompletion, info: Dict[str, Any]):
        # Handle one finished completion, e.g. run tool calls and submit follow-up turns.
        ...

    def postprocess(self, batch: DataProto, batch_conversations: List[List[Dict[str, str]]], n: int) -> DataProto:
        # Assemble the accumulated conversations into a DataProto batch for training.
        ...

For reference, I've added a unit test in test_custom_completion_callback.py that demonstrates how to write a custom CompletionCallback.

@casper-hansen
Contributor

@wuxibin89 thanks for addressing the usage needs. I just looked through the code and confirmed I have access to the tokenizer, sampling parameters, and postprocessing. So this looks good to me!

@vermouth1992 merged commit 457f4d2 into main on Jun 6, 2025
38 checks passed
@vermouth1992 deleted the wuxibin/chat_scheduler branch on June 6, 2025 at 23:47
waleko added a commit to waleko/verl that referenced this pull request Jun 10, 2025