Next-Token Prediction is All You Need
Python, 2.2k stars, 81 forks
Emu Series: Generative Multimodal Models from BAAI
Python, 1.7k stars, 85 forks
EVA Series: Visual Representation Fantasies from BAAI
Python, 2.5k stars, 185 forks
Painter & SegGPT Series: Vision Foundation Models from BAAI
Python, 2.6k stars, 179 forks
[ICLR'24 Spotlight] Uni3D: 3D Visual Representation from BAAI
Python, 589 stars, 39 forks
EVE Series: Encoder-Free Vision-Language Models from BAAI
Python, 332 stars, 8 forks
MTVCraft: An Open Veo3-style Audio-Video Generation Demo
Unified Vision-Language-Action Model
[ICLR 2025] Autoregressive Video Generation without Vector Quantization
[CVPR'25 Highlight] You See it, You Got it: Learning 3D Creation on Pose-Free Videos at Scale
[ICLR 2025 Spotlight] An open-sourced LLM judge for evaluating LLM-generated answers.
[ICLR 2025] Diffusion Feedback Helps CLIP See Better
[ECCV 2024] Tokenize Anything via Prompting