Release v0.2.1.post2 · flashinfer-ai/flashinfer

What's Changed

use the 3 latest PyTorch versions by @youkaichao in #835
docs: update installation by @zhyncs in #839
Update README.md: fixing a typo for "hierical" by @didier-durand in #836
Update page.rst: fixing 1 typo by @didier-durand in #841
Update README.md: fixing 1 typo by @didier-durand in #842
adds TensorRT-LLM to the list of projects adopting FlashInfer by @yzh119 in #843
perf: MLA decode kernel implemented by CuTe targeted to SM80 by @tsu-bin in #844
Update installation.rst: fixing 2 typos by @didier-durand in #840
fix: Pass backend in BatchPrefillWith*KVCacheWrapper.plan() by @sfc-gh-yewang in #808
bugfix: Fix inline RoPE in decode kernels by @MasterJH5574 in #847
misc: Remove duplicate param set in MLA kernel by @MasterJH5574 in #850
feat: add out and lse parameters to run functions to allow user-allocated output buffers by @yzh119 in #854
Make the symbol maybe_q_rope_offset_v unique by @foreverlms in #855
typo: update decode_maybe_q_rope_offset by @MasterJH5574 in #856
update ci by @zhyncs in #857
fix some compiler pre-checks by @foreverlms in #859
perf: dynamic split-k for MLA by @yzh119 in #863
Revert "fix: Pass backend in BatchPrefillWith*KVCacheWrapper.plan() (… by @zhyncs in #864
chore: bump v0.2.1.post2 by @zhyncs in #865
fix compile by @zhyncs in #866
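
The out and lse entry above (#854) describes letting callers pass their own output buffers to the run functions. The general pattern can be sketched in plain Python as below. This is only an illustration under stated assumptions: the function `run`, its arguments, and the list-based buffers are hypothetical stand-ins, not FlashInfer's actual API or signatures, and the toy attention omits scaling and batching.

```python
import math

def run(q, keys, values, out=None, lse=None):
    """Toy single-query attention (hypothetical; NOT FlashInfer's API).

    q:      list of d floats (the query)
    keys:   list of rows, each a list of d floats
    values: list of rows, each a list of d floats
    out:    optional preallocated list of d floats, written in place
    lse:    optional preallocated list of 1 float, receives the
            log-sum-exp of the attention scores
    """
    d = len(q)
    if out is None:
        # No buffer supplied: allocate one, as a conventional API would.
        out = [0.0] * d
    elif len(out) != d:
        raise ValueError("out must have the same length as q")

    # Unscaled dot-product scores (1/sqrt(d) scaling omitted for brevity).
    scores = [sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
    m = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)

    # Write the weighted average of the values into the output buffer.
    for j in range(d):
        out[j] = sum(w * v[j] for w, v in zip(weights, values)) / total

    if lse is not None:
        lse[0] = m + math.log(total)
    return out

# Usage: the caller owns both buffers and can reuse them across calls.
out_buf = [0.0, 0.0]
lse_buf = [0.0]
result = run(
    [1.0, 0.0],
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 2.0], [3.0, 4.0]],
    out=out_buf,
    lse=lse_buf,
)
```

Returning the same object that was passed in keeps the call site uniform whether or not a buffer is supplied, and reusing buffers avoids a fresh allocation on every call.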

Full Changelog: v0.2.1.post1...v0.2.1.post2