Enable rowwise scaling for DeepGemm (pytorch#3874)
Summary:
X-link: facebookresearch/FBGEMM#964
Pull Request resolved: pytorch#3874
This diff adds [ngimel's support for DeepGemm rowwise scaling](https://github.com/ngimel/DeepGEMM/tree/rowwise) to our fbcode copy. It also includes a few DeepGemm updates that allow operation on M < 128, which is important for any real use case. Rowwise scaling improves performance considerably, especially in memory-bound cases. Notably, this makes DeepGemm the premier solution for slow accumulation, as it now outperforms cuBLAS + unfused rowwise scaling overall.
{F1976375307}
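For readers unfamiliar with the term, here is a minimal pure-Python sketch of what a rowwise-scaled GEMM computes. It is a reference for the math only and is not the DeepGemm kernel or its API. The function names, the `qmax=448.0` default (the largest finite FP8 e4m3 value) and the `[N][K]` weight layout are all assumptions made for illustration, and the sketch skips the actual rounding to FP8.

```python
# Reference sketch only: names, layout and qmax are illustrative assumptions.

def rowwise_quantize(mat, qmax=448.0):
    """Scale each row so its largest magnitude maps to qmax.

    Returns (scaled_rows, scales). A real kernel would also round the
    scaled values to FP8; that step is omitted here.
    """
    scaled, scales = [], []
    for row in mat:
        amax = max(abs(v) for v in row)
        scale = amax / qmax if amax > 0 else 1.0  # avoid dividing by zero
        scaled.append([v / scale for v in row])
        scales.append(scale)
    return scaled, scales


def rowwise_scaled_gemm(xq, wq, x_scale, w_scale):
    """Compute out[i][j] = dot(xq[i], wq[j]) * x_scale[i] * w_scale[j].

    xq is [M][K], wq is [N][K]; one scale per row of each operand.
    Applying the scales inside the GEMM is the "fused" variant; the
    unfused variant would multiply by the scales in a separate pass.
    """
    out = []
    for i, x_row in enumerate(xq):
        out_row = []
        for j, w_row in enumerate(wq):
            acc = sum(a * b for a, b in zip(x_row, w_row))
            out_row.append(acc * x_scale[i] * w_scale[j])
        out.append(out_row)
    return out


x = [[1.0, -2.0, 3.0], [0.5, 0.25, -0.125]]   # M=2, K=3
w = [[2.0, 0.0, 1.0], [-1.0, 4.0, 0.5]]       # N=2, K=3
xq, xs = rowwise_quantize(x)
wq, ws = rowwise_quantize(w)
result = rowwise_scaled_gemm(xq, wq, xs, ws)
```

Because no rounding is applied, `result` matches the unscaled product of `x` and the transpose of `w` up to floating-point error. With real FP8 inputs the per-row scales are what preserve dynamic range for each row independently.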
Reviewed By: jianyuh
Differential Revision: D71748927
fbshipit-source-id: 87e287a2cec284bd8fd7c5e80603065a0d662f53