Modernize bf16 cutlass grouped gemm #3889
This pull request was exported from Phabricator. Differential Revision: D71920813
Summary:
Pull Request resolved: pytorch#3889
X-link: facebookresearch/FBGEMM#982
This diff unifies the API between FP8 and BF16 grouped gemm. Specifically we add the same dynamic, concatenated, and stacked APIs that are used for FP8 across both cutlass and CK. After this change, our tests can also be unified into a single grouped gemm test that covers all the various modes.
Reviewed By: jiawenliu64, mxz297
Differential Revision: D71920813
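For readers unfamiliar with grouped GEMM, here is a minimal plain-Python sketch of the computation and of how per-group results might be laid out. It is an illustration only. The function names are hypothetical and are not FBGEMM's API, and the mapping of these layouts to the "concatenated" and "stacked" modes named in the summary is an assumption. The "dynamic" mode is not sketched, since the summary does not describe it.

```python
# Illustrative reference only: plain-Python grouped GEMM.
# These helpers are hypothetical and do not mirror FBGEMM's kernels or API.

def matmul_nt(x, w):
    """Return x @ w^T for x of shape [M, K] and w of shape [N, K] (nested lists)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in w] for row in x]

def grouped_gemm_list(xs, ws):
    """One independent GEMM per group: Y_i = X_i @ W_i^T, returned as a list."""
    return [matmul_nt(x, w) for x, w in zip(xs, ws)]

def grouped_gemm_concat(xs, ws):
    """Same products, with all output rows concatenated along the M dimension."""
    return [row for y in grouped_gemm_list(xs, ws) for row in y]

def grouped_gemm_stacked(x_stacked, ws, m_sizes):
    """Assumed 'stacked' layout: inputs arrive as one [sum(M_i), K] matrix,
    and m_sizes gives the number of rows that belong to each group."""
    out, start = [], 0
    for w, m in zip(ws, m_sizes):
        out.extend(matmul_nt(x_stacked[start:start + m], w))
        start += m
    return out

# Two groups with M = 1 and M = 2, sharing K = 2 and N = 2.
xs = [[[1, 2]], [[3, 4], [5, 6]]]
ws = [[[1, 0], [0, 1]], [[1, 1], [1, -1]]]
x_stacked = [row for x in xs for row in x]
result_concat = grouped_gemm_concat(xs, ws)
result_stacked = grouped_gemm_stacked(x_stacked, ws, [1, 2])
```

In this sketch the concatenated and stacked variants produce identical rows and differ only in how the inputs are supplied, which is one reason a single test can cover several modes against a shared reference.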
Force-pushed: c197a45 to b891570
Force-pushed: b891570 to 47449ce
Force-pushed: 47449ce to bc3dd8d
Force-pushed: bc3dd8d to 3eadc7c
Force-pushed: 3eadc7c to fba1035
Force-pushed: fba1035 to 9c8d29d
Force-pushed: 9c8d29d to 7b4324e
Force-pushed: 7b4324e to 3e535ec
Force-pushed: 3e535ec to 48c9d3d
Force-pushed: 48c9d3d to 07feea3
This pull request has been merged in 5e684ba.
Summary:
X-link: pytorch#3889
Pull Request resolved: facebookresearch/FBGEMM#982
This diff unifies the API between FP8 and BF16 grouped gemm. Specifically we add the same dynamic, concatenated, and stacked APIs that are used for FP8 across both cutlass and CK. After this change, our tests can also be unified into a single grouped gemm test that covers all the various modes.
Reviewed By: jiawenliu64
Differential Revision: D71920813
fbshipit-source-id: 4928c4299d2b62e1722faf8b2bc1ba278adf23a1