BitDecoding

BitDecoding is a high-performance, GPU-optimized system designed to accelerate long-context LLM decoding with a low-bit KV cache. It achieves a 3-9x speedup over Flash Attention v2.
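
To make the term "low-bit KV cache" concrete, here is a minimal pure-Python sketch of 4-bit group quantization with two codes packed per byte. It is an illustration only and not BitDecoding's kernel or API: the function names `quantize_4bit` and `dequantize_4bit`, the asymmetric min/max scheme, and the nibble order are all assumptions chosen for the example.

```python
from typing import List, Tuple


def quantize_4bit(values: List[float]) -> Tuple[bytes, float, float]:
    """Quantize one group of floats to 4-bit codes (hypothetical helper).

    Returns the packed bytes, the scale, and the group minimum (zero point).
    """
    if not values or len(values) % 2 != 0:
        raise ValueError("group must be non-empty and of even length")
    lo, hi = min(values), max(values)
    # 4 bits give 16 levels (codes 0..15); fall back to 1.0 for constant groups.
    scale = (hi - lo) / 15.0 or 1.0
    codes = [min(15, max(0, round((v - lo) / scale))) for v in values]
    # Pack two codes per byte: even index in the low nibble, odd in the high nibble.
    packed = bytes(codes[i] | (codes[i + 1] << 4) for i in range(0, len(codes), 2))
    return packed, scale, lo


def dequantize_4bit(packed: bytes, scale: float, lo: float) -> List[float]:
    """Unpack the nibbles and map each code back to an approximate float."""
    out: List[float] = []
    for byte in packed:
        out.append((byte & 0x0F) * scale + lo)
        out.append((byte >> 4) * scale + lo)
    return out


if __name__ == "__main__":
    group = [0.0, 1.5, -3.0, 0.75, 2.25, -1.0, 3.0, 0.1]
    packed, scale, lo = quantize_4bit(group)
    print(len(packed), "bytes for", len(group), "values")
    print(dequantize_4bit(packed, scale, lo))
```

Each value is stored in half a byte plus a shared scale and minimum per group, which is where the memory savings of a low-bit cache come from. The rounding error per value is bounded by half the scale.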

Benchmark

Kernel Performance in RTX4090
Kernel Performance in A100

Installation

git clone --recursive https://github.com/DD-DuDa/BitDecoding.git
conda create -n bitdecode python=3.10
conda activate bitdecode
pip install -r requirements.txt
python setup.py install
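
After the steps above, a quick check that the expected modules can be found may save debugging time. The sketch below uses only the Python standard library. The module name `bit_decode` is an assumption based on the repository's directory of the same name, and `check_modules` is a hypothetical helper, not part of the project.

```python
import importlib.util
from typing import Dict, Iterable


def check_modules(names: Iterable[str]) -> Dict[str, bool]:
    """Report whether each top-level module can be located, without importing it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # "bit_decode" is assumed to be the installed package name.
    for name, found in check_modules(["torch", "bit_decode"]).items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

Run it from outside the repository root, since a local `bit_decode` directory on the import path would be found even if the installation failed.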

Quick Start

See benchmark/bench_single_decode.ipynb
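
For orientation, the following pure-Python sketch shows what a single decoding step computes, assuming the notebook name refers to attention for one query token over a cached set of keys and values. It is a slow reference for a single head, not the notebook's code or BitDecoding's API, and `decode_attention` is a hypothetical name.

```python
import math
from typing import List


def decode_attention(
    q: List[float], keys: List[List[float]], values: List[List[float]]
) -> List[float]:
    """Single-query attention: softmax(q . K^T / sqrt(d)) . V for one head."""
    d = len(q)
    # One scaled dot-product score per cached key.
    scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    # Weighted sum of the cached value vectors.
    out = [0.0] * len(values[0])
    for w, v in zip(weights, values):
        for j, vj in enumerate(v):
            out[j] += (w / total) * vj
    return out


if __name__ == "__main__":
    q = [1.0, 0.0]
    keys = [[0.0, 0.0], [0.0, 0.0]]
    values = [[1.0, 2.0], [3.0, 4.0]]
    print(decode_attention(q, keys, values))  # equal scores average the two value rows
```

In a low-bit setting, the keys and values would be dequantized from the packed cache before or during this computation. The cost grows linearly with the number of cached tokens, which is why long contexts make this step the bottleneck.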

(Optional) Play with libtorch C++

# download libtorch 

cd BitDecoding/csrc/bit_decode
mkdir build && cd build
cmake -DCMAKE_PREFIX_PATH=<libtorch_path> ..
make -j12

For an end-to-end inference example, please see e2e.

Citation

If you find BitDecoding useful or want to use it in your projects, please cite our paper:

@misc{du2025bitdecodingunlockingtensorcores,
      title={BitDecoding: Unlocking Tensor Cores for Long-Context LLMs Decoding with Low-Bit KV Cache}, 
      author={Dayou Du and Shijie Cao and Jianyi Cheng and Ting Cao and Mao Yang},
      year={2025},
      eprint={2503.18773},
      archivePrefix={arXiv},
      primaryClass={cs.AR},
      url={https://arxiv.org/abs/2503.18773}, 
}

Acknowledgement

BitDecoding is inspired by many open-source libraries, including (but not limited to) flash-attention, flute, Atom, omniserve, KIVI.
