Reproducible scaling laws for contrastive language-image learning [arXiv]
by Mehdi Cherti, Romain Beaumont, Ross Wightman, Mitchell Wortsman, Gabriel Ilharco, Cade Gordon, Christoph Schuhmann, Ludwig Schmidt, Jenia Jitsev [arXiv:2212.07143] (CVPR 2023)
In this repository, we provide the code for reproducing the experiments on large-scale CLIP pre-training and transfer to various downstream tasks from the paper "Reproducible scaling laws for contrastive language-image learning", together with pointers to the open artefacts from this study (pre-trained models and all intermediate checkpoints).
For follow-up work, see "Scaling Laws for Robust Comparison of Open Foundation Language-Vision Models and Datasets" (arXiv:2506.04598) and its research repo.
Pre-trained openCLIP models across various scales, final checkpoints: https://huggingface.co/laion/scaling-laws-openclip/tree/main
All intermediate checkpoints for the pre-trained models (useful for studying training dynamics): https://huggingface.co/laion/scaling-laws-openclip/tree/main/full_checkpoints
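For programmatic access (e.g. to study training dynamics across checkpoints), the files can also be fetched with the huggingface_hub library. The snippet below is only a sketch under the assumption that the intermediate checkpoints live under full_checkpoints/ in the repository linked above; it lists the available files first instead of hard-coding any file name:

```python
from huggingface_hub import hf_hub_download, list_repo_files

# List checkpoint files in the model repository (intermediate checkpoints
# are assumed to sit under full_checkpoints/, as linked above).
files = [f for f in list_repo_files("laion/scaling-laws-openclip")
         if f.startswith("full_checkpoints/")]
print(files[:10])  # inspect the naming scheme, then pick the checkpoint you need

# Download one checkpoint; the resulting local path can then be passed to
# open_clip.create_model_and_transforms(..., pretrained=local_path).
local_path = hf_hub_download("laion/scaling-laws-openclip", files[0])
print("Downloaded to:", local_path)
```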
You may also want to check out these related resources:
- the OpenCLIP repository, which points to the pre-trained models used in this study
- the LAION-400m and LAION-5B composition instructions; these are the datasets used for openCLIP pre-training in this study
- The instructions are also valid for Re-LAION, the updated safety revision of those datasets. Follow Re-LAION-research or Re-LAION-research-safe to obtain the Re-LAION datasets for openCLIP experiments.
- CLIP Benchmarking, the transfer evaluation used in this study
To reproduce scaling plots from the paper, see the figures notebook.
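The scaling fits in the paper take the form of a power law relating downstream error E to pre-training compute C, roughly E = β·C^α. As a minimal, generic sketch of how such a fit works (the data points below are made-up placeholders, and this is not the exact fitting procedure used in the figures notebook), the fit reduces to linear regression in log-log space:

```python
import numpy as np

# Hypothetical (compute, error) pairs -- placeholders, not results from the paper.
compute = np.array([1e9, 1e10, 1e11, 1e12])   # total pre-training compute (arbitrary units)
error = np.array([0.60, 0.45, 0.34, 0.26])    # downstream zero-shot error

# A power law E = beta * C**alpha is linear in log-log space:
# log(E) = log(beta) + alpha * log(C), so fit a degree-1 polynomial.
alpha, log_beta = np.polyfit(np.log(compute), np.log(error), 1)
print(f"alpha = {alpha:.3f}, beta = {np.exp(log_beta):.3g}")
```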
First, you need to clone the repo and install the requirements.
git clone https://github.com/LAION-AI/scaling-laws-openclip
cd scaling-laws-openclip
pip install -r requirements.txt
We provide a script, download_models.py, to download all pre-trained models used in the paper. To download all 29 models, use:
python download_models.py
You can also download a subset of the models. For instance:
python download_models.py --samples_seen 3B 13B --model ViT-B-32 --data 80M 400M 2B
will download only the ViT-B/32 models trained for 3B or 13B samples seen on any of the 80M/400M/2B LAION datasets.
Once you have downloaded the pre-trained models, you can use them directly in OpenCLIP. The following is an example with ViT-H/14.
First, you need to download the model:
> python download_models.py --samples_seen 34B --model ViT-H-14 --data 2B
'Model-H-14_Data-2B_Samples-34B_lr-5e-4_bs-79k.pt' downloaded.
Once the model is downloaded, it can be used directly in OpenCLIP:
import torch
import open_clip

# Load the downloaded checkpoint into an OpenCLIP ViT-H/14 model,
# together with the matching image preprocessing transforms.
model, _, preprocess = open_clip.create_model_and_transforms('ViT-H-14', pretrained='Model-H-14_Data-2B_Samples-34B_lr-5e-4_bs-79k.pt')
For a complete example, see the inference notebook.
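As a minimal sketch of zero-shot classification with this checkpoint (the image path and candidate labels below are placeholders, and this mirrors the standard OpenCLIP usage pattern rather than the exact code in the notebook):

```python
import torch
from PIL import Image
import open_clip

# Load the downloaded ViT-H/14 checkpoint and its preprocessing transforms.
model, _, preprocess = open_clip.create_model_and_transforms(
    'ViT-H-14', pretrained='Model-H-14_Data-2B_Samples-34B_lr-5e-4_bs-79k.pt')
tokenizer = open_clip.get_tokenizer('ViT-H-14')
model.eval()

# Placeholder inputs: replace with your own image and candidate labels.
image = preprocess(Image.open("example.jpg")).unsqueeze(0)
text = tokenizer(["a diagram", "a dog", "a cat"])

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    # Cosine similarity between normalized embeddings, turned into probabilities.
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print("Label probabilities:", probs)
```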
If you find this work helpful, please cite our paper:
@inproceedings{cherti2023reproducible,
title={Reproducible scaling laws for contrastive language-image learning},
author={Cherti, Mehdi and Beaumont, Romain and Wightman, Ross and Wortsman, Mitchell and Ilharco, Gabriel and Gordon, Cade and Schuhmann, Christoph and Schmidt, Ludwig and Jitsev, Jenia},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={2818--2829},
year={2023}
}
We would like to express our gratitude to all the people who work on making code, models and data publicly available, advancing community-based research and making research more reproducible. In particular, we thank all the members of the LAION Discord server community, whose effort to compose the LAION-400m and LAION-5B datasets was pivotal for this study, and OpenAI for making their pre-trained CLIP models publicly available. We also thank Hugging Face for providing hosting space for open datasets and models, and Stability AI for providing supercomputing resources and storage space.
The authors gratefully acknowledge the Gauss Centre for Supercomputing e.V. for funding this work by providing computing time through the John von Neumann Institute for Computing (NIC) on the GCS Supercomputer JUWELS Booster at Jülich Supercomputing Centre (JSC). We also acknowledge storage resources on JUST granted and operated by JSC, as well as computing resources from the Helmholtz Data Federation (HDF). Further thanks go to the JSC supercomputing facility administration team for their support, especially to Damian Alvarez for his endurance and patience during the long "de-micing" sessions on JUWELS Booster.
Mehdi Cherti and Jenia Jitsev acknowledge partial funding by the Federal Ministry of Education and Research of Germany under BMBF grant no. 01IS22094B WestAI - AI Service Center West.
Special thanks also go to Richard Vencu (LAION, Stability AI) for his ongoing dedication to enabling an HPC system, and the infrastructure around it, that can be used by a broad community of researchers and citizen scientists.