
[vulnops][data] fix: Replace unsafe pickle.loads with restricted unpickler in Qwen VL pipeline#3139

Draft
yaoyu-33 wants to merge 1 commit into main from security/fix-qwenvl-pickle

yaoyu-33 (Contributor) commented Apr 3, 2026

Summary

  • Add safe_pickle.py utility with _RestrictedUnpickler that only allows safe built-in types (list, dict, tuple, str, int, float, bool, bytes, etc.)
  • Replace pickle.loads() in Qwen VL videohandler with safe_pickle_loads() to prevent arbitrary code execution from malicious WebDataset shards
  • The restricted unpickler raises pickle.UnpicklingError if any non-whitelisted type is encountered
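
A minimal sketch of what such a restricted unpickler might look like, following the "restricting globals" pattern from the Python `pickle` documentation. The PR's actual `safe_pickle.py` is not shown on this page, so the allowlist contents and the error message below are assumptions; only the names `_RestrictedUnpickler` and `safe_pickle_loads` come from the summary above.

```python
import builtins
import io
import pickle

# Assumed allowlist; the list used in the PR may differ.
_SAFE_BUILTINS = frozenset({
    "list", "dict", "tuple", "set", "frozenset",
    "str", "int", "float", "bool", "bytes", "bytearray", "complex",
})


class _RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that resolves only an allowlisted set of built-in types."""

    def find_class(self, module, name):
        # find_class is called whenever the pickle stream references a global.
        if module == "builtins" and name in _SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def safe_pickle_loads(data: bytes):
    """Drop-in replacement for pickle.loads() on untrusted bytes."""
    return _RestrictedUnpickler(io.BytesIO(data)).load()


# Usage: plain containers and scalars round-trip unchanged.
sample = {"frames": [b"\x00\x01", b"\x02"], "fps": 29.97, "shape": (2, 224, 224)}
restored = safe_pickle_loads(pickle.dumps(sample))
```

Plain containers and scalars are encoded with dedicated pickle opcodes and never reach `find_class`, so the allowlist only matters for objects pickled by global reference. That is also the path code-execution payloads rely on.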

Test plan

  • Verify Qwen VL data loading still works with the restricted unpickler
  • Verify malicious pickle payloads are rejected
  • Run Qwen VL data pipeline tests

🤖 Generated with Claude Code

…Qwen VL data pipeline

The videohandler.__call__ method used pickle.loads() directly on data
from WebDataset shards, enabling arbitrary code execution via crafted
pickle payloads. Replace with a RestrictedUnpickler that only allows
safe built-in types (list, dict, tuple, str, int, float, etc.).

The restricted unpickler is placed in a shared utility module
(megatron.bridge.utils.safe_pickle) for reuse across other fixes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
copy-pr-bot bot commented Apr 3, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

yaoyu-33 changed the title from "[data] fix: Replace unsafe pickle.loads with restricted unpickler in Qwen VL pipeline" to "[vulnops][data] fix: Replace unsafe pickle.loads with restricted unpickler in Qwen VL pipeline" on Apr 3, 2026