dev-python/quack-kernels
QuACK attention kernels for vLLM/FlashInfer (cute-DSL based)
ChangeLog
commit 50dc7eccdb67c47628ae42212108f2e6678b22d1
Author: Ivan S. Titov <iohann.s.titov@gmail.com>
Date: Thu May 7 18:28:27 2026 +0200
dev-python/quack-kernels: new package, 0.3.3
Tier 4 of the vllm CUDA target packaging cycle. QuACK attention
kernels — pure-Python wrappers that JIT-compile attention/elementwise
kernels through nvidia-cutlass-dsl + apache-tvm-ffi at first use.
All deps are already in this overlay's tracked stack. Upstream's
cu13 extra is folded into the unconditional nvidia-cutlass-dsl
dep — our nvidia-cutlass-dsl always pulls the cu13 path on this
amd64+CUDA-13 overlay, so no USE flag is needed.
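The folding described above might look like the following in the ebuild's dependency block. This is a hypothetical sketch, not the actual ebuild: the package atoms for apache-tvm-ffi and nvidia-cutlass-dsl are assumptions based on the names in the commit message.

```shell
# Hypothetical RDEPEND sketch: upstream's cu13 extra collapses into an
# unconditional dep, since nvidia-cutlass-dsl on this overlay always
# provides the cu13 path. Atom names are assumptions, not verified.
RDEPEND="
	dev-python/apache-tvm-ffi[${PYTHON_USEDEP}]
	dev-python/nvidia-cutlass-dsl[${PYTHON_USEDEP}]
"
```

With no conditional branch, no USE flag is declared in IUSE and the dependency graph stays flat.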