Provide helper functions for int4 quantization #3775
This pull request was exported from Phabricator. Differential Revision: D70643388
✅ Deploy Preview for pytorch-fbgemm-docs ready!
Force-pushed from d4fd535 to 6fab18a.
Summary:
X-link: facebookresearch/FBGEMM#855

This diff introduces a set of quantization helper functions to fbgemm_gpu/experimental/gen_ai to make it easier to apply the new Int4 packing and preshuffling to weights.

Reviewed By: summerdengfb

Differential Revision: D70643388
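For readers unfamiliar with the terms in the summary, here is a minimal pure-Python sketch of what groupwise int4 quantization and packing generally involve. It is not the code added by this PR: the function names (`quantize_int4_groupwise`, `pack_int4`, `unpack_int4`), the symmetric scaling scheme, and the low-nibble-first byte layout are all assumptions made for illustration, and the kernel-specific preshuffling step is not modeled at all.

```python
from typing import List, Tuple


def quantize_int4_groupwise(
    row: List[float], group_size: int
) -> Tuple[List[int], List[float]]:
    """Symmetric int4 quantization of one weight row, one scale per group.

    Hypothetical helper for illustration only.
    """
    assert len(row) % group_size == 0, "row length must be a multiple of group_size"
    q: List[int] = []
    scales: List[float] = []
    for start in range(0, len(row), group_size):
        group = row[start : start + group_size]
        max_abs = max(abs(v) for v in group)
        # Map the largest magnitude to 7; use 1.0 for an all-zero group.
        scale = max_abs / 7.0 if max_abs > 0 else 1.0
        scales.append(scale)
        # Round to nearest and clamp to the signed 4-bit range [-8, 7].
        q.extend(max(-8, min(7, round(v / scale))) for v in group)
    return q, scales


def pack_int4(values: List[int]) -> bytes:
    """Pack pairs of signed int4 values into bytes, low nibble first (assumed layout)."""
    assert len(values) % 2 == 0, "need an even number of values"
    out = bytearray()
    for lo, hi in zip(values[0::2], values[1::2]):
        out.append((lo & 0xF) | ((hi & 0xF) << 4))
    return bytes(out)


def unpack_int4(packed: bytes) -> List[int]:
    """Inverse of pack_int4: recover signed int4 values from packed bytes."""

    def sign_extend(nibble: int) -> int:
        return nibble - 16 if nibble >= 8 else nibble

    values: List[int] = []
    for byte in packed:
        values.append(sign_extend(byte & 0xF))
        values.append(sign_extend(byte >> 4))
    return values
```

Packing halves the storage for the quantized weights, since two 4-bit values share one byte. A real helper would additionally reorder the packed data into the layout the kernel expects, which is what "preshuffling" refers to in the summary.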
Force-pushed from 6fab18a to 24e123e.
Force-pushed from 24e123e to d9e4e3f.
Force-pushed from d9e4e3f to afe0b73.
Force-pushed from afe0b73 to 4878396.
Force-pushed from 4878396 to 9a91b31.
Force-pushed from 9a91b31 to 3e47440.
Force-pushed from 3e47440 to b33cae8.
Force-pushed from b33cae8 to 77c4b3a.
Summary:
X-link: facebookresearch/FBGEMM#847

One notable change in the preshuffled F8I4 kernel is that group scales are downcast to FP8. This risks running into dynamic range issues and hurting accuracy. We can mitigate the risk by adding FP32 columnwise scaling to the output. Fortunately, this can be done with EVT, so the performance impact is negligible.

Reviewed By: jiawenliu64

Differential Revision: D70587477
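The dynamic range concern above can be illustrated with a generic two-level scaling sketch: factor each group scale into a per-column FP32 scale and a residual that stays within FP8 range. This is only an illustration of the idea, not the kernel's actual procedure. The function name `split_group_scales`, the `[group][column]` indexing, and the choice to normalize each column's largest scale to the FP8 E4M3 maximum are assumptions, and FP8 rounding itself is not emulated.

```python
from typing import List, Tuple

FP8_E4M3_MAX = 448.0  # largest finite value representable in FP8 E4M3


def split_group_scales(
    group_scales: List[List[float]],
) -> Tuple[List[List[float]], List[float]]:
    """Factor scales[g][n] into normalized[g][n] * column_scales[n].

    group_scales is indexed [group][output column] (assumed layout).
    The normalized part is what would be stored in FP8; the column scale
    stays in higher precision and is applied once to the output.
    """
    num_cols = len(group_scales[0])
    column_scales: List[float] = []
    for n in range(num_cols):
        col_max = max(row[n] for row in group_scales)
        # Put the largest group scale of this column at the FP8 maximum.
        column_scales.append(col_max / FP8_E4M3_MAX if col_max > 0 else 1.0)
    normalized = [
        [row[n] / column_scales[n] for n in range(num_cols)]
        for row in group_scales
    ]
    return normalized, column_scales
```

With this split, columns whose group scales are very small or very large in absolute terms no longer compete for the same narrow FP8 range; only the ratio between groups within a column has to be representable.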
Force-pushed from 77c4b3a to 75124a2.
This pull request has been merged in c8ee354.
Summary:
X-link: https://github.com/facebookresearch/FBGEMM/pull/855

Pull Request resolved: pytorch#3775

This diff introduces a set of quantization helper functions to fbgemm_gpu/experimental/gen_ai to make it easier to apply the new Int4 packing and preshuffling to weights.

Reviewed By: summerdengfb

Differential Revision: D70643388

fbshipit-source-id: 25d1896ead1918707c056e8cfb0d02bbdd24ecd1