Pull requests: HabanaAI/vllm-hpu-extension

[UX] Warning for exponential bucketing (#262, opened Jul 8, 2025 by adobrzyn)
Remove self_attn & lm_head from Mixtral quant config (#261, opened Jul 7, 2025 by tpawlows)
Update of exponential bs warmup mechanism (#258, opened Jul 4, 2025 by iboiko-habana)
Fix ModuleFusedSDPA graph break (#257, opened Jul 4, 2025 by bkowalskiINTEL)
[SW-233526] Fix runtime dequant for block fp8 (#251, opened Jul 1, 2025 by xuechendi)
Automatization of long context (#248, opened Jun 30, 2025 by iboiko-habana)
Add pre-commit static checks (#247, opened Jun 30, 2025 by kzawora-intel)
Update dependabot.yml (#242, opened Jun 26, 2025 by michalkuligowski)
Update linear.py (#239, opened Jun 25, 2025 by michalkuligowski)
Integrating block_softmax (#238, opened Jun 24, 2025 by ksmusz, draft)
Remove double generate (#229, opened Jun 18, 2025 by adobrzyn)
Exponential bucketing tweaks (#224, opened Jun 13, 2025 by madamczyk-intel)
Find bucket with bmin not divs by step (#212, opened Jun 5, 2025 by adobrzyn)
Add useful internal vllm test (#200, opened May 27, 2025 by nirda7, draft)
fix the issue that bmax not in bucket buffer (#191, opened May 22, 2025 by sywangyi)
Optimized MoE on Gaudi (#159, opened Apr 18, 2025 by gyou2021, draft)