
Commit 95f5ba9

YUNQIUGUO authored and facebook-github-bot committed
Fix f8f8bf16_lite quantize op input in quantize_and_compute (pytorch#745)
Summary:
X-link: pytorch#3667
Pull Request resolved: facebookresearch/FBGEMM#745

A minor fix for the trt-llm cudaCoreGemm `cuda_lite` op in the quantize_bench script. When testing with `--bench_quantize`, the following failure was detected:

```
... tree/deeplearning/fbgemm/fbgemm_gpu/experimental/gen_ai/bench/quantize_ops.py", line 797, in quantize_and_compute
    return self.compute(xq, wq, x_scale * w_scale)
TypeError: FP8LiteGemm.compute() missing 1 required positional argument: 'w_scale'
```

Reviewed By: jwfromm

Differential Revision: D69272912

fbshipit-source-id: c184954b4d2d1543277a9e56ac899534597a56e6
1 parent 9182eb0 commit 95f5ba9

File tree

1 file changed, +1 −1 lines changed


fbgemm_gpu/experimental/gen_ai/bench/quantize_ops.py

Lines changed: 1 addition & 1 deletion
```diff
@@ -719,7 +719,7 @@ def compute(self, xq, wq, x_scale, w_scale):
 
     def quantize_and_compute(self, x, w):
         xq, wq, x_scale, w_scale = self.quantize(x, w)
-        return self.compute(xq, wq, x_scale * w_scale)
+        return self.compute(xq, wq, x_scale, w_scale)
 
     @property
     def name(self) -> str:
```
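For illustration, here is a minimal runnable sketch of the failure mode and the fix. The class `FP8LiteGemmSketch` and its method bodies are hypothetical stand-ins, not the real FBGEMM op: the point is only that `compute()` declares `x_scale` and `w_scale` as separate positional parameters, so the pre-fix call that merged them into one product argument left `w_scale` unbound.

```python
# Minimal sketch (hypothetical class, not the actual FBGEMM FP8LiteGemm):
# compute() takes x_scale and w_scale as separate positional arguments, so
# passing their product x_scale * w_scale supplies only one scale and raises:
#   TypeError: compute() missing 1 required positional argument: 'w_scale'

class FP8LiteGemmSketch:
    def quantize(self, x, w):
        # Hypothetical stand-in: the real op returns FP8 tensors plus scales.
        return x, w, 0.5, 2.0

    def compute(self, xq, wq, x_scale, w_scale):
        # Dequantize-and-multiply stand-in for the cuda_lite GEMM kernel.
        return (xq * x_scale) * (wq * w_scale)

    def quantize_and_compute(self, x, w):
        xq, wq, x_scale, w_scale = self.quantize(x, w)
        # Before the fix: self.compute(xq, wq, x_scale * w_scale)  -> TypeError
        return self.compute(xq, wq, x_scale, w_scale)


print(FP8LiteGemmSketch().quantize_and_compute(3.0, 4.0))  # (3.0*0.5)*(4.0*2.0) = 12.0
```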
