nvidia-china-sae/mair-hub

MAIR-Hub

MAIR-Hub (MAIR stands for Multimodal AI Resources) is a central repository of tutorials, code examples, and other assets for multimodal AI research and applications.

Repository Structure

The following directories contain specialized resources for different aspects of multimodal AI:

Directory           Description
rl-tutorial         Reinforcement learning tutorials, including RL experiments with step-by-step guidance for reproduction
speech-llm          Speech LLM training recipes, such as training Qwen-Omni-like speech-to-speech models
external-resources  Curated links to other valuable multimodal AI resources

RL-Tutorial

The rl-tutorial directory contains resources focused on reinforcement learning approaches in multimodal AI:

  • r1-zero: Tutorial on using the veRL framework to reproduce the reinforcement learning training process of DeepSeek-R1-Zero in the mathematics domain.
  • r1-like: Tutorial on using the OpenRLHF framework to reproduce the reinforcement learning training process of DeepSeek-R1 in the mathematics domain.
  • vlm-R1: Tutorial on using the veRL framework to train vision-language models (VLMs) with reinforcement learning on both text and multimodal data, with the aim of enhancing reasoning capabilities in the mathematics domain.

Speech-LLM

The speech-llm directory provides resources for training Speech LLMs:

  • qwen-omni-like: Recipe for training Qwen2.5-Omni-style speech-to-speech (S2S) models.

External Resources

This section provides links to valuable external tutorials and resources related to multimodal AI:

Reasoning and Knowledge Distillation

  • Distilling DeepSeek R1 into Qwen: A tutorial demonstrating how to distill the reasoning abilities of DeepSeek R1 (a 671B-parameter MoE model) into smaller models such as Qwen using the NVIDIA NeMo 2.0 Framework. The repository includes notebooks for extracting reasoning data and for training models on the distilled data.

We are working on adding more tutorials and assets...

This project is licensed under the terms of the LICENSE file included in the repository.
