Commit 35f5175

zou3519 authored and jimoosciuc committed
Fix deepseek-v3 with torch.compile in PyTorch 2.6. (sgl-project#5213)
1 parent f497e33 commit 35f5175

1 file changed: +2 -1 lines changed

sgl-kernel/csrc/common_extension.cc

Lines changed: 2 additions & 1 deletion
@@ -177,7 +177,8 @@ TORCH_LIBRARY_FRAGMENT(sgl_kernel, m) {
    */
   m.def(
       "bmm_fp8(Tensor A, Tensor B, Tensor! D, Tensor A_scale, Tensor B_scale, Tensor workspace_buffer, int "
-      "cublas_handle, int cuda_stream) -> ()");
+      "cublas_handle, int cuda_stream) -> ()",
+      {at::Tag::needs_fixed_stride_order});
   m.impl("bmm_fp8", torch::kCUDA, &bmm_fp8);
 
   m.def(
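
The commit itself does not say why the tag is needed. PyTorch's operator tag documentation describes `needs_fixed_stride_order` as a request that Inductor pass tensors with the same stride order the operator sees in eager mode. The toy sketch below illustrates, under stated assumptions, why a kernel could depend on that. `View2D`, `at`, `at_assuming_row_major`, and the two storage helpers are hypothetical names invented for this illustration. They are not sgl-kernel or PyTorch code, and the commit does not state that `bmm_fp8` makes this particular layout assumption.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical 2-D view: a raw data pointer plus explicit strides (in elements).
struct View2D {
  const float* data;
  std::size_t rows, cols;
  std::size_t row_stride, col_stride;
};

// Stride-aware read: correct for any memory layout.
inline float at(const View2D& v, std::size_t r, std::size_t c) {
  return v.data[r * v.row_stride + c * v.col_stride];
}

// Read that ignores the strides and assumes row-major storage, the way a
// kernel operating on a bare pointer might.
inline float at_assuming_row_major(const View2D& v, std::size_t r, std::size_t c) {
  return v.data[r * v.cols + c];
}

// Logical matrix [[1, 2, 3], [4, 5, 6]] stored row-major (strides 3, 1).
inline const std::vector<float>& row_major_storage() {
  static const std::vector<float> s = {1, 2, 3, 4, 5, 6};
  return s;
}

// The same logical matrix stored column-major (strides 1, 2).
inline const std::vector<float>& col_major_storage() {
  static const std::vector<float> s = {1, 4, 2, 5, 3, 6};
  return s;
}
```

With `View2D rm{row_major_storage().data(), 2, 3, 3, 1}` and `View2D cm{col_major_storage().data(), 2, 3, 1, 2}`, the stride-aware `at` returns the same element for both views, while `at_assuming_row_major` reads a different element from `cm`. A compiler that is free to change a tensor's stride order can therefore change the result of a layout-sensitive kernel unless the operator declares a layout constraint.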