Cerebras ModelZoo

Introduction

The Cerebras ModelZoo is a collection of deep learning models and utilities optimized to run on Cerebras hardware. The repository provides reference implementations, configuration files, and supporting tools that demonstrate best practices for training and deploying models on Cerebras systems.

Key Features and Components

  • CLI: The ModelZoo CLI is a comprehensive command-line interface that serves as a single entry point for all ModelZoo-related tasks. It streamlines workflows such as data preprocessing, model training, and validation.
  • Models: Configuration files and reference implementations for a wide range of NLP, vision, and multimodal models, including LLaMA, Mixtral, DINOv2, and LLaVA. These are optimized for Cerebras hardware and follow best practices for performance and scalability.
  • Data Preprocessing Tools: Scripts and utilities for preparing datasets for training, including tokenization, formatting, and batching for supported models.
  • Checkpoint Converters and Porting Tools: Tools for converting between checkpoint formats (e.g., Cerebras ↔ HuggingFace) and porting PyTorch models to run on Cerebras systems.
  • Advanced Features: Support for custom training loops, custom model implementations, maximal update parameterization (μP), rotary position embedding (RoPE) scaling for extended sequence lengths, and more.
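
As a rough illustration of what RoPE scaling for extended sequence lengths means, the sketch below shows linear position interpolation, one common scaling scheme. It is a minimal pure-Python example under stated assumptions, not ModelZoo's implementation: the function names `rope_angles` and `apply_rope`, the `scaling_factor` parameter, and the default `base` of 10000 are all hypothetical choices made for this example.

```python
import math


def rope_angles(position, head_dim, base=10000.0, scaling_factor=1.0):
    """Return the RoPE rotation angles for a single token position.

    Assumption: linear position interpolation. With scaling_factor > 1 the
    position is divided by the factor, so positions in a longer sequence
    map back into the range seen during training.
    """
    if head_dim % 2 != 0:
        raise ValueError("head_dim must be even")
    scaled_position = position / scaling_factor
    return [
        scaled_position * base ** (-2.0 * i / head_dim)
        for i in range(head_dim // 2)
    ]


def apply_rope(vector, position, base=10000.0, scaling_factor=1.0):
    """Rotate consecutive pairs (x0, x1), (x2, x3), ... by the RoPE angles."""
    angles = rope_angles(position, len(vector), base, scaling_factor)
    rotated = []
    for i, theta in enumerate(angles):
        x, y = vector[2 * i], vector[2 * i + 1]
        rotated.append(x * math.cos(theta) - y * math.sin(theta))
        rotated.append(x * math.sin(theta) + y * math.cos(theta))
    return rotated
```

With `scaling_factor=2.0`, position 8 receives the same angles as position 4 does without scaling, which is how a model trained on a shorter context can address a sequence twice as long under this scheme.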

Ready to Get Started?

  • Reach out to Cerebras to get access to Cerebras hardware and ModelZoo.
  • Install Cerebras ModelZoo by following the steps in our setup guide.
  • Once you have ModelZoo installed, get started by pretraining or finetuning your first model!
  • Visit our developer documentation for comprehensive guides on everything you can do with Cerebras ModelZoo.

Models in this repository

Each model listed below has a code pointer to its implementation in the repository.

  • BERT
  • BLOOM
  • BTLM
  • DiT
  • DINOv2
  • DPO
  • DPR
  • ESM-2
  • Falcon
  • Gemma 2
  • GPT-2
  • GPT-3
  • GPT-J & GPT-NeoX
  • Jais
  • LLaMA
  • LLaVA
  • Mistral
  • Mixtral of Experts
  • Multimodal Simple
  • SantaCoder
  • StarCoder
  • Transformer
  • T5

License

Apache License 2.0
