FP8 GEMM tweak for 405B decoding #3104


Closed

Conversation

@zjing14 (Contributor) commented Sep 10, 2024

Summary:
Improve FP8 GEMM performance for memory-bound cases from 405B decoding.

- Improved 405B decoding GEMMs (see the bandwidth estimate below):
  - [1, 13312, 16384] from 74 us to 71 us
  - [1, 16384, 6656] from 34 us to 31 us
- Adjust tuning parameters to improve memory load performance
- Adjust the pipeline to overlap scaling and GEMM (a reference sketch follows the merged commit near the end of this thread)

Differential Revision: D62363038
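For context on why these shapes are memory-bound: with M = 1, kernel time is dominated by streaming the FP8 weight matrix of shape [N, K] from device memory. The following is a rough bandwidth estimate derived from the timings quoted above; the arithmetic is mine, not a measurement from this PR, and it ignores activations, scales, and output traffic.

```python
# Rough achieved-bandwidth estimate for the M = 1 decoding GEMMs above.
# Assumes the kernel is dominated by reading the 1-byte FP8 weight matrix
# of shape [N, K]; everything else is negligible at M = 1.

shapes_us = {
    (1, 13312, 16384): (74, 71),  # (M, N, K): (before_us, after_us)
    (1, 16384, 6656): (34, 31),
}

for (m, n, k), (before_us, after_us) in shapes_us.items():
    weight_bytes = n * k  # 1 byte per FP8 element
    for label, us in (("before", before_us), ("after", after_us)):
        tb_per_s = weight_bytes / (us * 1e-6) / 1e12
        print(f"[{m}, {n}, {k}] {label}: {us} us -> ~{tb_per_s:.2f} TB/s weight traffic")
```

For [1, 13312, 16384] this works out to roughly 2.9 TB/s before and 3.1 TB/s after, which is why a small tuning change to memory loads moves the needle here.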

netlify bot commented Sep 10, 2024

Deploy Preview for pytorch-fbgemm-docs ready!

- Latest commit: ce0e85a
- Latest deploy log: https://app.netlify.com/sites/pytorch-fbgemm-docs/deploys/66dfa74b2f3c800008424bf4
- Deploy Preview: https://deploy-preview-3104--pytorch-fbgemm-docs.netlify.app

@facebook-github-bot (Contributor) commented:

This pull request was exported from Phabricator. Differential Revision: D62363038

@facebook-github-bot (Contributor) commented:

This pull request has been merged in 04d3c58.

q10 pushed a commit to q10/FBGEMM that referenced this pull request Apr 10, 2025
Summary:
Pull Request resolved: facebookresearch/FBGEMM#192

X-link: pytorch#3104

Improve FP8 GEMM performance for memory-bound cases from 405B decoding
- Improved 405B decoding GEMMs:
  - [1, 13312, 16384] from 74 us to 71 us
  - [1, 16384, 6656] from 34 us to 31 us
- Adjust tuning parameters to improve memory load performance
- Adjust the pipeline to overlap scaling and GEMM

Reviewed By: jianyuh, jwfromm

Differential Revision: D62363038

fbshipit-source-id: 4beea7eb8c66605539f9dd675a61ba3fe8870a3f
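A note on the "overlap scaling and GEMM" item referenced in the summary above: the idea is to apply the per-row scale factors inside the GEMM pipeline instead of as a separate dequantization pass. Below is a minimal PyTorch emulation of what a rowwise-scaled FP8 GEMM computes, useful only as a numerics reference. It is not the CK kernel this PR tunes, and the FP8 dtype used here (torch.float8_e4m3fn) is an assumption; the format and layout on ROCm hardware may differ.

```python
import torch

def fp8_rowwise_gemm_reference(a_fp8, b_fp8, a_scale, b_scale):
    """Reference for a rowwise-scaled FP8 GEMM: C = (A * a_scale) @ (B * b_scale)^T.

    a_fp8: [M, K] FP8, a_scale: [M] per-row scale
    b_fp8: [N, K] FP8, b_scale: [N] per-row scale
    An optimized kernel folds these scales into the GEMM pipeline/epilogue
    rather than materializing dequantized matrices, which is what
    "overlapping scaling and gemm" alludes to.
    """
    a = a_fp8.to(torch.float32) * a_scale[:, None]
    b = b_fp8.to(torch.float32) * b_scale[:, None]
    return (a @ b.t()).to(torch.bfloat16)

# Example with one of the decoding shapes from the summary (M = 1).
m, n, k = 1, 16384, 6656
a = torch.randn(m, k).to(torch.float8_e4m3fn)
b = torch.randn(n, k).to(torch.float8_e4m3fn)
c = fp8_rowwise_gemm_reference(a, b, torch.ones(m), torch.ones(n))
print(c.shape)  # torch.Size([1, 16384])
```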