Pull requests: axolotl-ai-cloud/axolotl
Fix: do not call preprocess in multimodal or pretraining case
#2861
opened Jul 3, 2025 by
NanoCode012
fix: set add_generation_prompt to False when apply chat template for multimodal
#2859
opened Jul 3, 2025 by
NanoCode012
fix: remove unnecessary movement of eval logits to cpu
#2824
opened Jun 23, 2025 by
NanoCode012
Enable Memory Efficient Loading when using Deepspeed 3 for Mistral
#2804
opened Jun 18, 2025 by
benHeid
[Draft] Token-weighted datasets: Control up/down-sampling of multiple datasets
#2794
opened Jun 16, 2025 by
casper-hansen
Draft
feat(mm_chat): enhance multimodal chat collator for audio/text suppor…
hold
don't merge this yet
#2765
opened Jun 5, 2025 by
voidful
6 of 9 tasks
Add StableMax integration to enable grokking and prevent Softmax Collapse
#2761
opened Jun 5, 2025 by
ehartford
Make De-duplication Multi-threaded and Happen Only During Pre-processing
#2747
opened Jun 1, 2025 by
xzuyn
Create base docker images for CUDA 12.8 with custom FlashAttention 3 installed
#2685
opened May 16, 2025 by
winglian
User-agent on CI snapshot download
hold
don't merge this yet
#2665
opened May 12, 2025 by
winglian
Implement configurable handling of excess tokens in datasets
#2662
opened May 12, 2025 by
mhenrichsen
setup defaults for dataloader to ensure GPU is kept busy
#2632
opened May 5, 2025 by
winglian