FP8 tensorwise GEMM improvement #2585

Closed
wants to merge 1 commit into from

@jiawenliu64 jiawenliu64 commented May 13, 2024

As title

@facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D57263833

netlify bot commented May 13, 2024

Deploy Preview for pytorch-fbgemm-docs failed.

| Name | Link |
|---|---|
| 🔨 Latest commit | 5ce783e |
| 🔍 Latest deploy log | https://app.netlify.com/sites/pytorch-fbgemm-docs/deploys/66424d8e2458450008d6eda0 |

Summary:

This diff improves FP8 tensorwise GEMM performance by using scalar scale broadcasting together with EVT (the CUTLASS epilogue visitor tree).
- **FP8 CUTLASS tensorwise GEMM is 15% faster than FP8 CUTLASS rowwise GEMM on average (up to 2.7x faster)**
- Before this diff, FP8 tensorwise CUTLASS GEMM performed similarly to FP8 rowwise GEMM
- FP8 tensorwise is useful for models that are not very sensitive to numeric variance but need a latency/throughput boost (e.g., LLMs with 7B parameters, LDMs, etc.)

- More data can be found in [this data sheet](https://docs.google.com/spreadsheets/d/1SYSjYqWeESasl9LII-qHLHMvaNAXlV5wmCV9BWIrKBc/edit?usp=sharing) 
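
For readers unfamiliar with the two scaling modes, here is a minimal pure-Python sketch of the dequantization step applied to the GEMM accumulator. It is not the CUTLASS kernel from this diff; the function names (`matmul_nt`, `epilogue_rowwise`, `epilogue_tensorwise`) and the toy values are hypothetical, and plain floats stand in for FP8 data. It only illustrates why a single broadcast scalar is cheaper to apply than per-row scale vectors, and that the two agree when all row scales are equal.

```python
def matmul_nt(x, w):
    """Return x @ w.T for x of shape [M, K] and w of shape [N, K]."""
    return [[sum(a * b for a, b in zip(x_row, w_row)) for w_row in w]
            for x_row in x]


def epilogue_rowwise(acc, x_scales, w_scales):
    """Rowwise: one scale per row of x (M values) and per row of w (N values)."""
    return [[acc[i][j] * x_scales[i] * w_scales[j]
             for j in range(len(w_scales))]
            for i in range(len(x_scales))]


def epilogue_tensorwise(acc, x_scale, w_scale):
    """Tensorwise: one scalar per tensor, broadcast over the whole output."""
    scale = x_scale * w_scale  # combined once, then reused for every element
    return [[value * scale for value in row] for row in acc]


# Toy "quantized" inputs (assumed values; real inputs would be FP8).
x_q = [[1.0, 2.0], [3.0, 4.0]]              # [M=2, K=2]
w_q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # [N=3, K=2]

acc = matmul_nt(x_q, w_q)
tensorwise = epilogue_tensorwise(acc, 0.5, 0.25)
# Rowwise with identical per-row scales reduces to the tensorwise result.
rowwise = epilogue_rowwise(acc, [0.5, 0.5], [0.25, 0.25, 0.25])
```

The rowwise path has to load and index two scale vectors per output tile, while the tensorwise path multiplies by one value; the numeric trade-off is that a single scale cannot adapt to rows with very different magnitudes.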




TODO
1. Merge two FP8 tensorwise GEMMs into one
2. Support e5m2 for the backward pass (bwd) and bias

Differential Revision: D57263833

@facebook-github-bot

This pull request has been merged in 17a4e18.
