[CUTLASS] Use custom copy of cutlass to enable FP8 Grouped Gemm. #3649

Closed
wants to merge 1 commit into from

jwfromm
Contributor

@jwfromm jwfromm commented Jan 31, 2025

To support MoE models, we need to enable FP8 rowwise grouped GEMM in FBGEMM. One missing piece is support for rowwise scaling in CUTLASS. We have enabled this feature in a custom copy of CUTLASS, but getting it into mainline will take a while. As a temporary measure, we can point FBGEMM at the custom copy instead. Once the feature lands in mainline and we bump our support, we can go back to using the main repo.
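For readers unfamiliar with the term, here is a rough pure-Python sketch of what "rowwise-scaled grouped GEMM" computes. It is not the CUTLASS or FBGEMM implementation. The function names (`quantize_rowwise`, `scaled_gemm`, `grouped_gemm_rowwise`) are hypothetical, the `[N, K]` weight layout is an assumption, and FP8 rounding is deliberately not emulated: values are only scaled into the FP8 E4M3 range so that the role of the per-row scales is visible.

```python
# Reference sketch only: rowwise scaling plus one GEMM per group.
# Assumptions: weights are stored as [N, K]; FP8 rounding is NOT emulated.
FP8_E4M3_MAX = 448.0  # largest finite value of the FP8 E4M3 format


def quantize_rowwise(mat):
    """Scale each row into the E4M3 range; return (scaled rows, per-row scales)."""
    scaled, scales = [], []
    for row in mat:
        amax = max(abs(v) for v in row)
        scale = amax / FP8_E4M3_MAX if amax > 0 else 1.0  # avoid divide-by-zero
        scaled.append([v / scale for v in row])
        scales.append(scale)
    return scaled, scales


def scaled_gemm(xq, wq, x_scale, w_scale):
    """Compute xq @ wq^T, then rescale entry (i, j) by x_scale[i] * w_scale[j]."""
    return [
        [
            sum(a * b for a, b in zip(x_row, w_row)) * x_scale[i] * w_scale[j]
            for j, w_row in enumerate(wq)
        ]
        for i, x_row in enumerate(xq)
    ]


def grouped_gemm_rowwise(xs, ws):
    """Run one independent rowwise-scaled GEMM per group (e.g. per MoE expert)."""
    outputs = []
    for x, w in zip(xs, ws):
        xq, x_scale = quantize_rowwise(x)
        wq, w_scale = quantize_rowwise(w)
        outputs.append(scaled_gemm(xq, wq, x_scale, w_scale))
    return outputs
```

Each group can have a different number of rows, which is why MoE workloads, where each expert receives a different number of tokens, map naturally onto a grouped GEMM. A real kernel would fuse the rescaling into the GEMM epilogue rather than run it as a separate pass.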

netlify bot commented Jan 31, 2025

Deploy Preview for pytorch-fbgemm-docs ready!

Name Link
🔨 Latest commit 5e03cb9
🔍 Latest deploy log https://app.netlify.com/sites/pytorch-fbgemm-docs/deploys/679d1d7f53df340008917aad
😎 Deploy Preview https://deploy-preview-3649--pytorch-fbgemm-docs.netlify.app
@facebook-github-bot
Contributor

@jwfromm has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@jwfromm jwfromm requested a review from q10 January 31, 2025 19:00
@jiawenliu64 jiawenliu64 self-requested a review January 31, 2025 19:04
@facebook-github-bot
Contributor

@jwfromm merged this pull request in 0b2c24f.

avbokovoy pushed a commit to ROCm/FBGEMM that referenced this pull request Feb 14, 2025
Summary:
X-link: https://github.com/facebookresearch/FBGEMM/pull/726

To support MOE models, we need to enable FP8 rowwise grouped gemm in FBGEMM. One missing piece to do this is support for rowwise scaling in cutlass. We have enabled this feature in a custom copy of cutlass but getting it into mainline will take a while. As a temporary measure, we can point FBGEMM to a custom copy of cutlass instead. Once the feature lands in mainline and we bump our support, we can go back to using the main repo.

Pull Request resolved: pytorch#3649

Reviewed By: q10, jiawenliu64

Differential Revision: D68967944

Pulled By: jwfromm

fbshipit-source-id: 3e4625227ba6c33cf0478811fc9a8d40af361612
q10 pushed a commit to q10/FBGEMM that referenced this pull request Apr 10, 2025
Summary:
Pull Request resolved: facebookresearch/FBGEMM#726

To support MOE models, we need to enable FP8 rowwise grouped gemm in FBGEMM. One missing piece to do this is support for rowwise scaling in cutlass. We have enabled this feature in a custom copy of cutlass but getting it into mainline will take a while. As a temporary measure, we can point FBGEMM to a custom copy of cutlass instead. Once the feature lands in mainline and we bump our support, we can go back to using the main repo.

X-link: pytorch#3649

Reviewed By: q10, jiawenliu64

Differential Revision: D68967944

Pulled By: jwfromm

fbshipit-source-id: 3e4625227ba6c33cf0478811fc9a8d40af361612