Bifrost-1: Bridging Multimodal LLMs and Diffusion Models with Patch-level CLIP Latents (NeurIPS 2025)
Official implementation of Bifrost-1, a unified framework that bridges pretrained multimodal LLMs (MLLMs) and diffusion models using patch-level CLIP image embeddings as implicit 2D image priors, which are natively aligned with the MLLM’s CLIP visual encoder.
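The core idea above can be sketched in a few lines. This is a minimal, hypothetical illustration (all names, shapes, and functions below are ours, not the repo's API): the MLLM branch predicts a 2D grid of patch-level CLIP latents in the same space as its own CLIP visual encoder, and the diffusion branch consumes that grid as an image-shaped conditioning signal.

```python
import numpy as np

# Hypothetical dimensions for illustration only.
PATCH_GRID = 16   # e.g. a 16x16 grid of image patches
CLIP_DIM = 1024   # dimensionality of each patch-level CLIP latent

def mllm_predict_patch_latents(prompt: str, rng: np.random.Generator) -> np.ndarray:
    """Stand-in for the MLLM branch: in the real model these latents would be
    predicted from the prompt; here we just sample a (grid, grid, dim) array."""
    return rng.standard_normal((PATCH_GRID, PATCH_GRID, CLIP_DIM)).astype(np.float32)

def diffusion_conditioning(latents: np.ndarray) -> np.ndarray:
    """Stand-in for handing the latents to the diffusion branch: reshape the
    grid to channel-first feature maps, the layout a conditioning branch of a
    diffusion U-Net/DiT would typically expect."""
    return np.transpose(latents, (2, 0, 1))  # (CLIP_DIM, PATCH_GRID, PATCH_GRID)

rng = np.random.default_rng(0)
latents = mllm_predict_patch_latents("a red cube on a blue sphere", rng)
cond = diffusion_conditioning(latents)
print(cond.shape)  # (1024, 16, 16)
```

The point of the design is that the latents live in a space the MLLM's frozen CLIP encoder already understands, so the bridge needs no new alignment from scratch.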
Han Lin, Jaemin Cho, Amir Zadeh, Chuan Li, Mohit Bansal
```bash
conda create -n bifrost1 python==3.11
conda activate bifrost1
pip install -r requirements.txt
```

The model checkpoint can be downloaded from Hugging Face [here](https://huggingface.co/hanlincs/Bifrost-1). You can download it to a `local_dir` of your choice with:
```python
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="hanlincs/Bifrost-1",
    repo_type="model",
    local_dir="xxxxxxxx",  # replace with your local directory
    local_dir_use_symlinks=False,
)
```
Generate images from GenEval prompts:

```bash
python inference_geneval_dpgbench.py --eval_geneval --output_dir "./outputs" --local_checkpoint_path XXXXX
```
🌟 Please let us know via issues or PRs if you have any questions. If you find our project useful in your research or application development, citing our paper is the best way to support us!
```bibtex
@inproceedings{linbifrost,
  title={Bifrost-1: Bridging Multimodal LLMs and Diffusion Models with Patch-level CLIP Latents},
  author={Lin, Han and Cho, Jaemin and Zadeh, Amir and Li, Chuan and Bansal, Mohit},
  booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems},
  year={2025}
}
```
The development of Bifrost-1 has been greatly inspired by the following amazing works and teams:
We hope that releasing this model/codebase helps the community to continue pushing these creative tools forward in an open and responsible way.
