feat: add hf_template tokenize_and_mask method + verl SFTTrainer compat for RLLMSFTDataset #485
Merged
Conversation
1. `RLLMSFTDataset.__init__` now accepts `processor` and `max_samples` kwargs, matching verl's `create_sft_dataset()` call signature. Without this, using `RLLMSFTDataset` as `custom_cls` with verl's `SFTTrainer(config)` crashes with a `TypeError`.
2. Add an `hf_template` tokenization method that uses `tokenizer.apply_chat_template()` directly instead of rLLM's `ChatTemplateParser`. The existing `cumulative`/`stepwise` methods render tool calls as JSON-in-XML, which is wrong for models with a native XML tool-call format (e.g. Qwen3-Coder); `hf_template` produces the model's native format. Config: `data.rllm.tokenize_and_mask_method: hf_template`
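A minimal sketch of the signature fix, assuming the names described above; the class body and the other constructor arguments are illustrative, not the actual rLLM code:

```python
# Sketch: accept the extra kwargs verl's create_sft_dataset() passes,
# so instantiating the dataset as custom_cls no longer raises TypeError.
# Only the processor/max_samples kwargs are taken from the PR description;
# everything else here is illustrative.
class RLLMSFTDataset:
    def __init__(self, parquet_files, tokenizer, config,
                 processor=None, max_samples=None):
        self.parquet_files = parquet_files
        self.tokenizer = tokenizer
        self.config = config
        self.processor = processor      # unused for text-only SFT; kept for compat
        self.max_samples = max_samples  # lets the caller cap the dataset size

# verl-style instantiation now succeeds instead of raising TypeError:
ds = RLLMSFTDataset(["train.parquet"], tokenizer=None, config={},
                    processor=None, max_samples=128)
```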
Collaborator

Looks good to me!
Summary

Two changes to `RLLMSFTDataset`:

1. **verl SFTTrainer compatibility:** `RLLMSFTDataset.__init__` now accepts `processor` and `max_samples` kwargs, matching the call signature of verl's `create_sft_dataset()`. Without this, using `RLLMSFTDataset` as `custom_cls` in verl's `SFTTrainer(config)` crashes with `TypeError: unexpected keyword argument`.

2. **`hf_template` tokenization method:** the existing `cumulative` and `stepwise` methods use rLLM's `ChatTemplateParser` to render messages, which renders tool calls as JSON embedded in XML tags. But models like Qwen3-Coder expect their native XML tool-call format. The new `hf_template` method uses `tokenizer.apply_chat_template()` directly, producing the model's native format. It uses an incremental prefix-diff approach to isolate each message's tokens for correct loss masking.
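The prefix-diff idea can be sketched as follows. This is not the PR's actual implementation: the function and variable names are hypothetical, and a toy renderer and tokenizer stand in for `tokenizer.apply_chat_template()` and a real tokenizer so the mechanics are visible:

```python
def tokenize_and_mask(messages, render, tokenize):
    """Prefix-diff masking sketch: render the chat over messages[:i+1] and
    attribute the newly appended tokens to message i, so template-inserted
    special tokens are masked together with the message they belong to."""
    input_ids, loss_mask = [], []
    prev_ids = []
    for i, msg in enumerate(messages):
        cur_ids = tokenize(render(messages[: i + 1]))
        # The rendered template must be prefix-stable for the diff to be valid.
        assert cur_ids[: len(prev_ids)] == prev_ids
        new_ids = cur_ids[len(prev_ids):]
        keep = int(msg["role"] == "assistant")  # loss only on assistant tokens
        input_ids += new_ids
        loss_mask += [keep] * len(new_ids)
        prev_ids = cur_ids
    return input_ids, loss_mask

# Toy stand-ins for apply_chat_template(..., tokenize=False) and a tokenizer:
render = lambda msgs: "".join(f"<{m['role']}>{m['content']}</{m['role']}>" for m in msgs)
tokenize = lambda text: [ord(c) for c in text]

msgs = [{"role": "user", "content": "hi"},
        {"role": "assistant", "content": "ok"}]
ids, mask = tokenize_and_mask(msgs, render, tokenize)
```

In the real method, `render` would be the tokenizer's chat template applied with `tokenize=False` and `tokenize` the tokenizer itself; masked positions are then typically set to `-100` in the labels so the loss ignores non-assistant tokens.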
Config: `data.rllm.tokenize_and_mask_method: hf_template`

Files changed

- `rllm/trainer/verl/sft_dataset.py` (~40 lines)

Test plan

- `SFTTrainer(config)` + `custom_cls: RLLMSFTDataset` works
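For reference, the dotted config key `data.rllm.tokenize_and_mask_method` can be written as a YAML fragment; the surrounding nesting below is inferred from the key path, not copied from verl's config files:

```yaml
data:
  rllm:
    # Use the HF chat template directly instead of rLLM's ChatTemplateParser
    tokenize_and_mask_method: hf_template
```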