Releases: mosaicml/llm-foundry
v0.21.0
TLDR
- Torch version has been bumped to 2.7.0
- Added FSDP2 support, enabled via an environment variable: FSDP_VERSION=2 (see the sketch after this list). Currently it only supports pretraining (with meta init). No YAML change is needed to enable FSDP2; attributes that only apply to FSDP1 will be ignored and surfaced as warnings. See the Composer release notes for more details
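A minimal sketch of opting into FSDP2 from a Python launcher. Only the `FSDP_VERSION=2` variable comes from these notes; the script and YAML paths below are illustrative:

```python
# Hypothetical launch snippet: opt into FSDP2 before training starts.
# Only FSDP_VERSION=2 is from the release notes; paths are illustrative.
import os
import subprocess

os.environ["FSDP_VERSION"] = "2"  # unset (or "1") keeps FSDP1 behavior

# Launch a pretraining run exactly as before; no YAML changes required.
subprocess.run(
    ["composer", "scripts/train/train.py", "yamls/pretrain/mpt-125m.yaml"],
    check=True,
)
```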
What's Changed
- Adding support for nope positional encoding in block overrides. by @ShashankMosaicML in #1794
- Bump foundry version to 0.21.0.dev0 by @dakinggg in #1812
- Adding temperature tuning in attention by @ShashankMosaicML in #1793
- Update foundry version in MCLI yamls by @dakinggg in #1813
- Upgrade yapf version by @dakinggg in #1814
- Allow subselecting the appropriate config for llama4 by @dakinggg in #1815
- Change RMSNorm to use PyTorch native implementation by @josejg in #1809
- Update datasets requirement from <3.6,>=3.3.2 to >=3.3.2,<3.7 by @dependabot in #1817
- Bump onnxruntime from 1.19.2 to 1.22.0 by @dependabot in #1819
- Update huggingface-hub[hf_xet] requirement from <0.31,>=0.30.0 to >=0.30.0,<0.32 by @dependabot in #1818
- Deprecate inference API wrappers by @dakinggg in #1821
- Fix Dtensor initialization by @bowenyang008 in #1820
- Update accelerate requirement from <1.7,>=0.25 to >=0.25,<1.8 by @dependabot in #1824
- Bump onnx from 1.17.0 to 1.18.0 by @dependabot in #1823
- Bump docformatter for python3.12 and change blank_line_before_module_docstring = false by @sashaDoubov in #1825
- Delete useless print("here") by @tsebaka in #1826
- Update ci-testing version to latest by @dakinggg in #1827
- Bump coverage[toml] from 7.8.0 to 7.8.2 by @dependabot in #1830
- Configurable shard size by @dakinggg in #1833
- Bump Composer 0.31.0 by @bowenyang008 in #1835
- Fix monolithic checkpointing against composer main by @dakinggg in #1836
- Bump torch version to 2.7 by @bowenyang008 in #1832
- bump huggingface-hub upper bound to 0.33 by @bowenyang008 in #1838
New Contributors
- @bowenyang008 made their first contribution in #1820
- @tsebaka made their first contribution in #1826
Full Changelog: v0.20.0...v0.21.0
v0.20.0
What's Changed
- Bump Dev 0.20.0.dev0 by @KuuCi in #1778
- Bump Example Yamls to use 0.19.0 by @KuuCi in #1779
- Making tokenizers optional in the building of LLMs by @ethantang-db in #1781
- Remove some more calls to HF during CI by @dakinggg in #1780
- Modify validation check for multimodal messages by @adyasha-db in #1787
- Remove all connection to HF in CI by @dakinggg in #1786
- Update transformers requirement from <4.50,>=v4.49.0 to >=v4.49.0,<4.52 by @dependabot in #1788
- Bump einops from 0.8.0 to 0.8.1 by @dependabot in #1776
- Bump gitpython from 3.1.43 to 3.1.44 by @dependabot in #1775
- Update transformers to 4.51 by @dakinggg in #1790
- Update setuptools requirement from <78.0.0 to <80.0.0 by @dependabot in #1796
- Update tiktoken requirement from <0.8.1,>=0.4 to >=0.4,<0.9.1 by @dependabot in #1797
- Update packaging requirement from <25,>=21 to >=21,<26 by @dependabot in #1800
- Update accelerate requirement from <1.4,>=0.25 to >=0.25,<1.7 by @dependabot in #1799
- extended hf_checkpointer for any additional content saving by @ethantang-db in #1792
- Load model only on global rank 0 for mixed init by @dakinggg in #1795
- added attn_implementation for hf_base.py by @ethantang-db in #1801
- Update setuptools requirement from <80.0.0 to <81.0.0 by @dependabot in #1803
- Update datasets requirement from <3.4,>=3.3.2 to >=3.3.2,<3.6 by @dependabot in #1807
- Update grouped-gemm version by @dakinggg in #1810
- Remove some old deprecated code/comments by @dakinggg in #1811
New Contributors
- @ethantang-db made their first contribution in #1781
- @adyasha-db made their first contribution in #1787
Full Changelog: v0.19.0...v0.20.0
v0.19.0
What's New
1. Python 3.12 Bump (#1755)
We've added support for Python 3.12 and deprecated Python 3.9 support.
What's Changed
- Use llmfoundry image instead of pytorch image for gpu tests by @rithwik-db in #1752
- bump dev version to 0.19.0.dev0 by @rithwik-db in #1753
- Bump mcli yaml examples to use 0.18.0 and torch 2.6 by @rithwik-db in #1754
- Fix meta initialization for FSDP training with HF models and TE Layers by @jjuvonen-amd in #1745
- Fix bugs in `llmfoundry/data/text_data.py` by @gsganden in #1760
- Update setuptools requirement from <76.0.0 to <78.0.0 by @dependabot in #1758
- Update README.md by @gsganden in #1721
- Add error handling for general table download errors by @dakinggg in #1761
- modified the packing slightly to enable inheritance by @abaheti95 in #1762
- Remove registration fallback by @dakinggg in #1764
- Move save/load planner creation to after config logging by @dakinggg in #1769
- Bump Python 3.12 by @KuuCi in #1755
- Fix GPU Tests 3.10 by @KuuCi in #1770
- Remove a bunch of repeated calls to HF in the tests by @dakinggg in #1768
- Bump coverage[toml] from 7.6.10 to 7.8.0 by @dependabot in #1767
- Update mlflow requirement from <2.19,>=2.14.1 to >=2.14.1,<2.22 by @dependabot in #1766
- Bump Composer 0.30.0 by @KuuCi in #1772
- Bump streaming 0.12.0 by @KuuCi in #1777
New Contributors
- @jjuvonen-amd made their first contribution in #1745
- @gsganden made their first contribution in #1760
- @abaheti95 made their first contribution in #1762
Full Changelog: v0.18.0...v0.19.0
v0.18.0
What's Changed
- Torch has been bumped to `2.6.0` (in #1740). Sparse support has been disabled in the latest megablocks version (as part of the torch upgrade), and we cascaded those disables to llm-foundry as well (for more details, view the megablocks release).
- `TransformerEngine` has been removed from the `all` dependency group due to version compatibility issues (in #1742). We expect to add it back in a future release.
- Transformers has been bumped to `v4.49.0` (in #1735), which would result in the master weights being `torch.bfloat16` (view huggingface/transformers#36567 for more context). llm-foundry doesn't support master weights in lower precision, so we manually hardcoded this to `torch.float32` when loading in #1734 (see the sketch below).
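The gist of the #1734 workaround, sketched with plain transformers calls (the checkpoint name is a placeholder; the actual llm-foundry code path differs):

```python
# Sketch of the idea in #1734: force master weights to float32 at load time,
# overriding the bfloat16 dtype that transformers >= 4.49 would otherwise use.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "gpt2",  # placeholder checkpoint
    torch_dtype=torch.float32,  # keep master weights in full precision
)
assert next(model.parameters()).dtype == torch.float32
```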
Detailed Changes
- remove deprecated param by @bigning in #1727
- Bump TE for FA 2.7.1.post1 bump by @KuuCi in #1730
- Fix dtype issue in transformers by @dakinggg in #1734
- Bump composer to 0.29.0 by @rithwik-db in #1733
- Bump Transformer v4.49.0 by @KuuCi in #1735
- Bump FA2 to 2.7.4.post1 by @KuuCi in #1728
- Comment GHCR Image Upload by @KuuCi in #1739
- Remove TE from all dependency group by @dakinggg in #1742
- Bump torch to 2.6 by @rithwik-db in #1740
- Update Makefile to use WORLD_SIZE by @irenedea in #1751
New Contributors
- @rithwik-db made their first contribution in #1733
Full Changelog: v0.17.1...v0.18.0
v0.17.1
What's New
Datasets version upgrade (#1724)
We've upgraded the Hugging Face datasets library to a version that includes a fix for a common issue where the multiprocessing pool hangs after tokenization or filtering.
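For context, this is the kind of workload that could hang before the fix; a minimal sketch using the standard datasets API (the dataset and tokenizer choices are illustrative):

```python
# Sketch: multiprocess tokenization with datasets, the workload where the
# pool could previously hang after map/filter completed.
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # illustrative tokenizer
ds = load_dataset("ag_news", split="train")        # illustrative dataset

tokenized = ds.map(
    lambda batch: tokenizer(batch["text"]),
    batched=True,
    num_proc=8,  # spawns the multiprocessing pool that used to hang on exit
)
```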
What's Changed
- Update accelerate requirement from <1.2,>=0.25 to >=0.25,<1.4 by @dependabot in #1714
- Bump datasets version by @dakinggg in #1724
Full Changelog: v0.17.0...v0.17.1
v0.17.0
What's Changed
- Update mcli examples to use 0.16.0 by @irenedea in #1713
- Refactor HF checkpointer by @milocress in #1690
Previously, MLflow required PEFT models to be specified as a special "flavor" distinct from Transformers models. This workaround is no longer necessary, allowing us to simplify the codepath and cleanly separate uploading HuggingFace checkpoints from registering trained models (a rough sketch of the simplified flow follows below).
- Bump version to 0.18.0.dev by @milocress in #1717
This removes the deprecated `sample_weighing_factor` argument from `mpt` loss calculations.
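A rough sketch of the simplified flow: logging a Transformers model through the single transformers flavor. The model and artifact path are illustrative, and this is not foundry's internal code:

```python
# Sketch: with recent MLflow, a (possibly PEFT-wrapped) model can go through
# the regular transformers flavor instead of a special PEFT-specific one.
import mlflow
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder model
tokenizer = AutoTokenizer.from_pretrained("gpt2")

with mlflow.start_run():
    mlflow.transformers.log_model(
        transformers_model={"model": model, "tokenizer": tokenizer},
        artifact_path="checkpoint",  # illustrative artifact path
    )
```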
Full Changelog: v0.16.0...v0.17.0
v0.16.0
What's New
Streaming 0.11.0 🚀 (#1711)
We've upgraded streaming to 0.11.0. StreamingDataset can now be used with custom Stream implementations via a registry. See the documentation page for example usage.
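A hypothetical sketch of the registry pattern; the registry import and `register` signature below are assumptions, so consult the streaming 0.11.0 documentation for the real API:

```python
# Hypothetical sketch of registering a custom Stream (names are assumptions;
# see the streaming 0.11.0 docs for the actual registry interface).
from streaming import Stream

class MyFilteringStream(Stream):
    """A custom Stream that could, e.g., filter or reweight shards."""

# Assumed registration hook, modeled on foundry-style registries.
from streaming.base.stream import streams_registry  # assumption
streams_registry.register("my_filtering_stream", func=MyFilteringStream)
```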
What's Changed
- Fix llama3 example yamls by @j316chuck in #1688
- Update example yamls to use newest foundry version by @snarayan21 in #1689
- Update datasets requirement from <2.21,>=2.20.0 to >=2.20.0,<3.2 by @dependabot in #1670
- Catch multiple slashes in source dataset into one slash by @KuuCi in #1697
- Make loaded peft adapters optionally trainable by @snarayan21 in #1701
- Adding preprocessors for QA and messages datasets by @ShashankMosaicML in #1700
- Update pycln by @b-chu in #1704
- Add permission error by @b-chu in #1703
- Update datasets requirement from <3.2,>=2.20.0 to >=2.20.0,<3.3 by @dependabot in #1698
- Bump coverage[toml] from 7.6.4 to 7.6.10 by @dependabot in #1702
- Update mosaicml-streaming to 0.11.0 by @es94129 in #1711
- Bump version to 0.17.0.dev0 by @irenedea in #1712
Full Changelog: v0.15.1...v0.16.0
v0.15.1
What's Changed
- Bump version 0.16.0.dev0 by @j316chuck in #1667
- Update mlflow requirement from <2.18,>=2.14.1 to >=2.14.1,<2.19 by @dependabot in #1673
- Speed up embedding tests by @dakinggg in #1668
- Add mcli yaml version bump by @j316chuck in #1674
- Bump Openai version by @snarayan21 in #1684
- Bump Streaming to v0.10.0 by @snarayan21 in #1685
- Bugfix auto packing with streams + no remote path by @mattyding in #1679
- Bump Composer to v0.28.0 by @snarayan21 in #1687
- Expose `DistributedSampler` RNG seed argument by @janEbert in #1677
- Add llama3 ft example yamls by @j316chuck in #1686
Full Changelog: v0.15.0...v0.15.1
v0.15.0
New Features
Open Source Embedding + Contrastive Code (#1615)
LLM Foundry now supports finetuning embedding models with a contrastive loss. Negative passages for the contrastive loss can either be randomly selected or pre-defined; a generic sketch of the loss follows below. For more information, please view the readme.
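For intuition, here is a minimal in-batch-negatives (InfoNCE-style) contrastive loss in plain PyTorch. This is a generic sketch, not foundry's actual implementation:

```python
# Generic InfoNCE-style contrastive loss with in-batch negatives; a sketch
# for intuition only, not llm-foundry's actual implementation.
import torch
import torch.nn.functional as F

def contrastive_loss(query_emb: torch.Tensor,
                     passage_emb: torch.Tensor,
                     temperature: float = 0.05) -> torch.Tensor:
    """query_emb, passage_emb: (batch, dim); row i of each is a positive pair.
    Every other passage in the batch serves as a negative for query i."""
    q = F.normalize(query_emb, dim=-1)
    p = F.normalize(passage_emb, dim=-1)
    logits = q @ p.T / temperature                     # (batch, batch) similarities
    labels = torch.arange(q.size(0), device=q.device)  # positives on the diagonal
    return F.cross_entropy(logits, labels)
```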
PyTorch 2.5.1 (#1665)
This release updates LLM Foundry to the PyTorch 2.5.1 release, bringing with it support for the new features and optimizations in PyTorch 2.5.1.
Improved error messages (#1657, #1660, #1623, #1625)
Various improved error messages, making debugging user errors more clear.
What's Changed
- Update mcli examples to use 0.14.0 by @irenedea in #1624
- Open Source Embedding + Contrastive Code by @KuuCi in #1615
- Catch delta table not found error by @milocress in #1625
- Add Mlflow 403 PL UserError by @mattyding in #1623
- Catches when data prep cluster fails to start by @milocress in #1628
- Bump mlflow max version by @dakinggg in #1629
- add another cluster connection failure wrapper by @milocress in #1630
- Add MLflow `log_model` option by @nancyhung in #1544
- Move loss generating token counting to the dataloader by @dakinggg in #1632
- Bump databricks-connect from 14.1.0 to 15.4.3 by @dependabot in #1636
- Fix dataset download location by @dakinggg in #1639
- Revert "Bump databricks-connect from 14.1.0 to 15.4.3" by @XiaohanZhangCMU in #1640
- Bump transformers version by @dakinggg in #1631
- Fix gpu tests test_tp_train and test_huggingface_conversion_callback_interval by @irenedea in #1642
- Update datasets requirement from <2.20,>=2.19 to >=2.20.0,<2.21 by @dependabot in #1330
- Add max shard size to transformers save_pretrained by @b-chu in #1648
- Update huggingface-hub requirement from <0.25,>=0.19.0 to >=0.19.0,<0.27 by @dependabot in #1652
- Update accelerate requirement from <0.34,>=0.25 to >=0.25,<1.2 by @dependabot in #1633
- Catch Delta Table Not Found by @KuuCi in #1653
- Add Exception for missing UC column by @milocress in #1654
- Infer step size for Embeddings by @KuuCi in #1647
- Pin FAv2 by @mvpatel2000 in #1656
- Retry catching BlockingIOError by @KuuCi in #1657
- Catch bad data prep by @milocress in #1644
- Update pytest-cov requirement from <6,>=4 to >=4,<7 by @dependabot in #1663
- Bump coverage[toml] from 7.6.1 to 7.6.4 by @dependabot in #1650
- Move transform_model_pre_registration in hf_checkpointer by @irenedea in #1664
- Catch Cluster Permissions Error by @KuuCi in #1660
- Mosaicml version bump by @j316chuck in #1661
- Changes for removing unused terms in CE loss fn by @gupta-abhay in #1643
- Update setuptools requirement from <68.0.0 to <76.0.0 by @dependabot in #1662
- Bump docker version to torch 2.5.1 by @j316chuck in #1665
- Bump ubuntu 22.04 + torch 2.5.1 by @KuuCi in #1666
New Contributors
- @mattyding made their first contribution in #1623
Full Changelog: v0.14.5...v0.15.0