v0.2.2.post1

@yzh119 released this 27 Feb 06:00
· 186 commits to main since this release

What's Changed

  • bump version to v0.2.2 by @yzh119 in #891
  • perf: fix the performance of second stage of split-k by @yzh119 in #894
  • fix: pin_memory use cpu as default device by @KnowingNothing in #895
  • perf: tweak register amount for producer/consumer in MLA template by @yzh119 in #896
  • perf: fix MLA split-k performance bug by @yzh119 in #898
  • perf: use f16 as split-k partial output data type by @yzh119 in #900
  • perf: tweak the pipeline design of mla kernel by @yzh119 in #901

Full Changelog: v0.2.2...v0.2.2.post1