- Your GPU is a Monster. Don't Let It Starve -> Max out every part of the GPU (napkin math first)
- A High-Level Overview of LLM Systems -> A broad overview without getting lost in technical details
- Writing Your First CUDA Kernel -> An introduction to GPU programming with a simple CUDA kernel
- The Art of Pointer Arithmetic -> The underlying memory layout of tensor representations
- Tiling and Shared Memory -> Dividing the matrix into blocks that fit within the cache
- Global Memory Coalescing -> Combining adjacent accesses into a single memory transaction
- RL in LLM Post-training -> How RL enables LLMs to reason
- RL Framework Design Space -> A discussion of RL infrastructure form factors
- Setting Up RL Infra -> My logs from playing with Slime
- Notes from VeRL Talk -> A write-up of Haibin Lin's introduction and Q&A on VeRL at the PyTorch webinar
- Don't Just .cast() -> Notes from the video: mxfp8, mxfp4, nvfp4 formats and applications in PyTorch - Vasily Kuznetsov & Driss Guessous, Meta
- The Missing 10 Bits -> Notes from the blog: Some Matrix Multiplication Engines Are Not As Accurate As We Thought