
Releases: mosaicml/llm-foundry

v0.21.0

31 May 00:57

TLDR

  • Torch version has been bumped to 2.7.0
  • FSDP2 is supported via an environment variable: FSDP_VERSION=2. Currently it only supports pretraining (with meta init). No YAML change is needed to enable FSDP2; attributes that only apply to FSDP(1) are ignored and surfaced as warnings. See the Composer release for more details. (A minimal launch sketch follows this list.)
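As a hedged sketch, the snippet below sets FSDP_VERSION=2 before launching a training run. The composer launcher invocation, script path, and YAML name are placeholders rather than confirmed values, not the exact command for any particular setup.

```python
# Minimal sketch: opt into FSDP2 by setting FSDP_VERSION=2 before launching
# training. The launcher command, script path, and YAML below are placeholders.
import os
import subprocess

env = dict(os.environ, FSDP_VERSION="2")  # FSDP(1)-only yaml attrs are ignored with warnings
subprocess.run(
    ["composer", "train.py", "yamls/pretrain/mpt-125m.yaml"],  # placeholder paths
    env=env,
    check=True,
)
```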

What's Changed

New Contributors

Full Changelog: v0.20.0...v0.21.0

v0.20.0

29 Apr 20:09

What's Changed

New Contributors

Full Changelog: v0.19.0...v0.20.0

v0.19.0

07 Apr 20:25

What's New

1. Python 3.12 Bump (#1755)

We've added support for Python 3.12 and deprecated Python 3.9 support.

What's Changed

New Contributors

Full Changelog: v0.18.0...v0.19.0

v0.18.0

18 Mar 18:31

What's Changed

  • Torch has been bumped to 2.6.0 (in #1740)
    • Sparse support has been disabled in the latest megablocks version (as part of the torch upgrade), and we have cascaded those removals to llm-foundry as well (for more details, see the megablocks release).
  • TransformerEngine has been removed from the all dependency group due to version compatibility issues (in #1742). We expect to add this back in a future release.
  • Transformers has been bumped to v4.49.0 (in #1735), which would result in master weights being loaded as torch.bfloat16 (see huggingface/transformers#36567 for more context). llm-foundry doesn't support master weights in lower precision, so we hardcode them to torch.float32 when loading in #1734. (A minimal sketch follows this list.)
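As a minimal sketch (not the exact llm-foundry code path), the snippet below shows the general idea of forcing float32 master weights when loading a Hugging Face model whose config would otherwise default to bfloat16. The model name is a placeholder.

```python
# Sketch only: keep master weights in torch.float32 when loading a Hugging Face
# model whose config would otherwise default them to torch.bfloat16.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "my-org/my-model",          # placeholder model name
    torch_dtype=torch.float32,  # force full-precision master weights
)
assert next(model.parameters()).dtype == torch.float32
```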

Detailed Changes

New Contributors

Full Changelog: v0.17.1...v0.18.0

v0.17.1

21 Feb 22:12

What's New

Datasets version upgrade (#1724)

We've upgraded the Hugging Face datasets library to include a fix for a common issue where the multiprocessing pool would hang after tokenization or filtering.
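For context, this is an illustrative sketch of the kind of multiprocessing usage that could previously hang; the dataset file, tokenizer, and field names are placeholders, not code from llm-foundry.

```python
# Illustrative only: parallel tokenization followed by a parallel filter, the
# workload pattern affected by the multiprocessing-pool hang.
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
ds = load_dataset("json", data_files="train.jsonl", split="train")  # placeholder data

ds = ds.map(lambda ex: tokenizer(ex["text"]), num_proc=8)            # tokenization pass
ds = ds.filter(lambda ex: len(ex["input_ids"]) > 0, num_proc=8)      # filtering pass
```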

What's Changed

Full Changelog: v0.17.0...v0.17.1

v0.17.0

30 Jan 23:53

What's Changed

  • Update mcli examples to use 0.16.0 by @irenedea in #1713
  • Refactor HF checkpointer by @milocress in #1690
    Previously, MLflow required PEFT models to be specified as a special "flavor" distinct from Transformers models. This workaround is no longer necessary, allowing us to simplify the code path and cleanly separate uploading HuggingFace checkpoints from registering trained models.
  • Bump version to 0.18.0.dev by @milocress in #1717
    Removes the deprecated sample_weighing_factor argument from MPT loss calculations.

Full Changelog: v0.16.0...v0.17.0

v0.16.0

17 Jan 19:34

What's New

Streaming 0.11.0 🚀 (#1711)

We've upgraded streaming to 0.11.0. StreamingDataset can now be used with custom Stream implementations via a registry. See the documentation page for example usage.
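As a hedged sketch, the snippet below subclasses streaming's Stream and passes it to StreamingDataset directly; the registry-based registration added in 0.11.0 is not shown here, and the class name and paths are placeholders. See the streaming documentation for the exact registration API.

```python
# Sketch only: a custom Stream subclass used with StreamingDataset.
from streaming import Stream, StreamingDataset

class LoggingStream(Stream):
    """Hypothetical Stream subclass; override hooks here to customize behavior."""
    pass

dataset = StreamingDataset(
    streams=[LoggingStream(remote="s3://my-bucket/data", local="/tmp/data")],  # placeholder paths
    batch_size=8,
)
```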

What's Changed

Full Changelog: v0.15.1...v0.16.0

v0.15.1

05 Dec 20:59

What's Changed

New Contributors

Full Changelog: v0.15.0...v0.15.1

v0.15.0

23 Nov 02:13

New Features

Open Source Embedding + Contrastive Code (#1615)

LLM Foundry now supports finetuning embedding models with a contrastive loss. Negative passages for the contrastive loss can be either randomly selected or pre-defined. For more information, please see the README.
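Purely as a hypothetical illustration of what a contrastive sample with pre-defined negatives might look like, the snippet below builds one JSONL record; the field names and schema here are assumptions, so consult the README for the actual data format.

```python
# Hypothetical illustration only: field names are NOT confirmed by llm-foundry.
import json

sample = {
    "query": "what is machine learning?",
    "positive_passages": ["Machine learning is a subfield of AI ..."],
    "negative_passages": ["The Great Barrier Reef lies off the coast of Australia."],
}
print(json.dumps(sample))
```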

PyTorch 2.5.1 (#1665)

This release updates LLM Foundry to PyTorch 2.5.1, bringing with it the new features and optimizations in that release.

Improved error messages (#1657, #1660, #1623, #1625)

Various error messages have been improved, making user errors easier to debug.

What's Changed

New Contributors

Full Changelog: v0.14.5...v0.15.0

v0.14.5

18 Nov 17:15
  • Move transform_model_pre_registration in hf_checkpointer (#1664)
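As a hedged sketch of how this hook can be used, the snippet below overrides transform_model_pre_registration on a checkpointer subclass; the import path and method signature are assumptions, so check llmfoundry/callbacks/hf_checkpointer.py for the current definition.

```python
# Sketch only: customize the model just before it is registered.
from llmfoundry.callbacks import HuggingFaceCheckpointer

class MyCheckpointer(HuggingFaceCheckpointer):
    def transform_model_pre_registration(self, model):
        # Apply any last-minute edits (e.g. merging adapters) here;
        # placeholder pass-through shown.
        return model
```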

Full Changelog: v0.14.4...v0.14.5