Provide helper functions for int4 quantization #3775

jwfromm · 2025-03-06T03:41:33Z

Summary: This diff introduces a set of quantization helper functions to fbgemm_gpu/experimental/gen_ai to make it easier to apply the new Int4 packing and preshuffling to weights.

Differential Revision: D70643388
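For readers unfamiliar with the terminology, here is a minimal pure-Python sketch of what groupwise int4 quantization and packing generally involve. It is not the helper API added by this diff: the function names (`quantize_int4_groupwise`, `pack_int4`, `unpack_int4`), the symmetric scaling scheme, and the low-nibble-first byte layout are all assumptions made for illustration. The kernel-specific preshuffle step is omitted because its layout is not described in this thread.

```python
from typing import List, Tuple


def quantize_int4_groupwise(row: List[float], group_size: int) -> Tuple[List[int], List[float]]:
    """Symmetric per-group quantization of one weight row to ints in [-8, 7].

    Hypothetical illustration only; not the fbgemm_gpu helper.
    """
    if len(row) % group_size != 0:
        raise ValueError("row length must be a multiple of group_size")
    quantized: List[int] = []
    scales: List[float] = []
    for start in range(0, len(row), group_size):
        group = row[start:start + group_size]
        max_abs = max(abs(v) for v in group)
        # Map the largest magnitude in the group to 7; avoid dividing by zero.
        scale = max_abs / 7.0 if max_abs > 0 else 1.0
        scales.append(scale)
        for v in group:
            quantized.append(max(-8, min(7, round(v / scale))))
    return quantized, scales


def pack_int4(values: List[int]) -> bytes:
    """Pack pairs of signed int4 values into bytes (assumed layout: low nibble first)."""
    if len(values) % 2 != 0:
        raise ValueError("need an even number of int4 values")
    out = bytearray()
    for lo, hi in zip(values[0::2], values[1::2]):
        out.append((lo & 0xF) | ((hi & 0xF) << 4))
    return bytes(out)


def unpack_int4(packed: bytes) -> List[int]:
    """Inverse of pack_int4: recover signed int4 values from packed bytes."""
    values: List[int] = []
    for byte in packed:
        for nibble in (byte & 0xF, byte >> 4):
            # Nibbles 8..15 encode the negative values -8..-1 (two's complement).
            values.append(nibble - 16 if nibble >= 8 else nibble)
    return values


row = [7.0, -3.0, 1.0, 0.0, 3.5, -3.5, 0.5, 1.0]
q, scales = quantize_int4_groupwise(row, group_size=4)
packed = pack_int4(q)
```

Packing halves the storage of the quantized weights (eight int4 values fit in four bytes here), and `unpack_int4(packed)` recovers `q` exactly; only the rounding in the quantization step loses information.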

facebook-github-bot · 2025-03-06T03:41:44Z

This pull request was exported from Phabricator. Differential Revision: D70643388

netlify · 2025-03-06T03:41:53Z

✅ Deploy Preview for pytorch-fbgemm-docs ready!

Name	Link
🔨 Latest commit	`75124a2`
🔍 Latest deploy log	https://app.netlify.com/sites/pytorch-fbgemm-docs/deploys/67d48c0b21268200081e1426
😎 Deploy Preview	https://deploy-preview-3775--pytorch-fbgemm-docs.netlify.app

Summary: X-link: facebookresearch/FBGEMM#855 This diff introduces a set of quantization helper functions to fbgemm_gpu/experimental/gen_ai to make it easier to apply the new Int4 packing and preshuffling to weights. Reviewed By: summerdengfb Differential Revision: D70643388

facebook-github-bot · 2025-03-06T18:30:13Z

This pull request was exported from Phabricator. Differential Revision: D70643388

facebook-github-bot · 2025-03-07T17:34:26Z

This pull request was exported from Phabricator. Differential Revision: D70643388

facebook-github-bot · 2025-03-07T17:41:23Z

This pull request was exported from Phabricator. Differential Revision: D70643388

Summary: X-link: facebookresearch/FBGEMM#855 Pull Request resolved: pytorch#3775 This diff introduces a set of quantization helper functions to fbgemm_gpu/experimental/gen_ai to make it easier to apply the new Int4 packing and preshuffling to weights. Reviewed By: summerdengfb Differential Revision: D70643388

facebook-github-bot · 2025-03-12T23:03:53Z

This pull request was exported from Phabricator. Differential Revision: D70643388

facebook-github-bot · 2025-03-12T23:17:29Z

This pull request was exported from Phabricator. Differential Revision: D70643388

Summary: X-link: facebookresearch/FBGEMM#855 Pull Request resolved: pytorch#3775 This diff introduces a set of quantization helper functions to fbgemm_gpu/experimental/gen_ai to make it easier to apply the new Int4 packing and preshuffling to weights. Differential Revision: D70643388 Reviewed By: summerdengfb

facebook-github-bot · 2025-03-13T22:59:31Z

This pull request was exported from Phabricator. Differential Revision: D70643388

Summary: X-link: facebookresearch/FBGEMM#847 One notable change in the preshuffled F8I4 kernel is that group scales are downcast to FP8. This risks dynamic range issues that could hurt accuracy. We can mitigate the risk by adding FP32 columnwise scaling to the output. Fortunately, this can be done with EVT, so the performance impact is negligible. Reviewed By: jiawenliu64 Differential Revision: D70587477
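To make the dynamic range concern concrete, here is a toy numeric sketch. It is not the kernel or its EVT epilogue: `fake_fp8_e4m3` is a simplified stand-in for an FP8 E4M3 cast (3 mantissa bits, smallest step 2**-9, clamp at 448), and `split_scales` is a hypothetical way to factor one output column's group scales into an FP32 column scale plus normalized group scales. Both names and the normalize-by-maximum choice are assumptions for illustration.

```python
import math
from typing import List, Tuple


def fake_fp8_e4m3(x: float) -> float:
    """Toy model of rounding a float to FP8 E4M3 (assumption, not a real cast)."""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    mag = min(abs(x), 448.0)          # clamp to the largest finite E4M3 value
    _, e = math.frexp(mag)            # mag = m * 2**e with 0.5 <= m < 1
    exp = max(e - 1, -6)              # subnormals share the minimum exponent -6
    step = 2.0 ** (exp - 3)           # spacing with 3 mantissa bits
    return sign * round(mag / step) * step


def split_scales(group_scales: List[float]) -> Tuple[float, List[float]]:
    """Factor one column's group scales into (FP32 column scale, FP8 group scales)."""
    col_scale = max(group_scales)
    if col_scale == 0.0:
        return 1.0, [0.0 for _ in group_scales]
    # Normalized scales lie in (0, 1], well inside the toy FP8 range.
    return col_scale, [fake_fp8_e4m3(s / col_scale) for s in group_scales]


scales = [1e-4, 2e-4]
direct = [fake_fp8_e4m3(s) for s in scales]      # small scales flush to zero
col_scale, normalized = split_scales(scales)     # column scale stays in full precision
effective = [col_scale * s for s in normalized]  # what the output would be scaled by
```

In this toy model the direct cast loses both scales entirely, while the factored form keeps them because the small magnitude is carried by the full-precision column scale and applied once to the output.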

facebook-github-bot · 2025-03-14T20:05:35Z

This pull request was exported from Phabricator. Differential Revision: D70643388

facebook-github-bot · 2025-03-17T05:49:58Z

This pull request has been merged in c8ee354.

Summary: X-link: https://github.com/facebookresearch/FBGEMM/pull/855 Pull Request resolved: pytorch#3775 This diff introduces a set of quantization helper functions to fbgemm_gpu/experimental/gen_ai to make it easier to apply the new Int4 packing and preshuffling to weights. Reviewed By: summerdengfb Differential Revision: D70643388 fbshipit-source-id: 25d1896ead1918707c056e8cfb0d02bbdd24ecd1

Summary: Pull Request resolved: facebookresearch/FBGEMM#855 X-link: pytorch#3775 This diff introduces a set of quantization helper functions to fbgemm_gpu/experimental/gen_ai to make it easier to apply the new Int4 packing and preshuffling to weights. Reviewed By: summerdengfb Differential Revision: D70643388 fbshipit-source-id: 25d1896ead1918707c056e8cfb0d02bbdd24ecd1

facebook-github-bot added the cla signed label Mar 6, 2025

facebook-github-bot added the fb-exported label Mar 6, 2025

jwfromm force-pushed the export-D70643388 branch from d4fd535 to 6fab18a Compare March 6, 2025 18:30

jwfromm force-pushed the export-D70643388 branch from 6fab18a to 24e123e Compare March 7, 2025 17:34

jwfromm force-pushed the export-D70643388 branch from 24e123e to d9e4e3f Compare March 7, 2025 17:34

jwfromm force-pushed the export-D70643388 branch from d9e4e3f to afe0b73 Compare March 7, 2025 17:41

jwfromm force-pushed the export-D70643388 branch from afe0b73 to 4878396 Compare March 12, 2025 23:00

jwfromm force-pushed the export-D70643388 branch from 4878396 to 9a91b31 Compare March 12, 2025 23:00

jwfromm force-pushed the export-D70643388 branch from 9a91b31 to 3e47440 Compare March 12, 2025 23:03

jwfromm force-pushed the export-D70643388 branch from 3e47440 to b33cae8 Compare March 12, 2025 23:17

jwfromm force-pushed the export-D70643388 branch from b33cae8 to 77c4b3a Compare March 13, 2025 22:59

jwfromm added 2 commits March 14, 2025 13:05

jwfromm force-pushed the export-D70643388 branch from 77c4b3a to 75124a2 Compare March 14, 2025 20:05

facebook-github-bot closed this in c8ee354 Mar 17, 2025

facebook-github-bot added the Merged label Mar 17, 2025

q10 added category:improvement feature:quantize labels Mar 21, 2025
