Pull requests: HabanaAI/vllm-hpu-extension
- #255: Fix for calibration error TypeError: generate_responses() missing 1 required positional argument: 'args' (opened Jul 2, 2025 by tthakkal)
- #249: vllm hpu-extension for automatization of long context, prompt (opened Jun 30, 2025 by iboiko-habana)
- #246: Allow usage of fused_block_softmax_adjustment for Qwen with Lazy (opened Jun 27, 2025 by mswiniarsk) [Draft]
- #204: Fix max_blocks for warmup decode buckets in case of disabled CONTIGUOUS PA feature (opened May 29, 2025 by iboiko-habana)
- #203: Use sets for faster filter checks. Better long context support (opened May 28, 2025 by pi314ever)
- #197: [SW-225565] Enable triangular softmax with merged prefill (opened May 26, 2025 by kamil-kaczor) [Draft]
- #175: [SW-233624] Unify FusedMoe with expert parallelism (opened May 14, 2025 by mengniwang95)