dev-python/flashinfer-python
FlashInfer: kernel library for LLM serving (Python frontend)
ChangeLog
commit b4cf378def63a936ade846dce872c3faa9bd3bcf
Author: Ivan S. Titov <iohann.s.titov@gmail.com>
Date: Thu May 7 18:38:08 2026 +0200
dev-python/flashinfer-python: new package, 0.6.8.post1
Tier 5 of the vllm CUDA target packaging cycle. This is FlashInfer's
Python frontend, the kernel library through which vllm's CUDA target
dispatches attention / fused-MoE / GEMM workloads.
Pure-Python install at packaging time: the C++ csrc/include and
vendored cutlass / spdlog ride along as data files for the runtime
JIT compilation pipeline. No nvcc invocation at install, so the
gcc-15 host-pin from tilelang isn't needed here. (At first JIT use,
end-users hit the same nvcc/gcc-15 compatibility window — that's
their build environment's responsibility.)
Two ebuild-side fixes:
* Same upstream py-modules-leak issue as torch-c-dlpack-ext, but with
TWO modules — pyproject.toml's [tool.setuptools] py-modules =
["build_backend", "build_utils"] would ship both PEP-517 backend
helpers at the top of site-packages. Drop both in
python_install_all.
* Upstream's requirements.txt lists nvidia-ml-py as the runtime dep,
but at first import flashinfer/utils.py does `import pynvml`, i.e.
the legacy module name. ::gentoo's dev-python/nvidia-ml-py installs
a `pynvml.py` shim at /usr/lib/pythonX.Y/site-packages/pynvml.py,
so the import resolves as long as nvidia-ml-py is the package that
is actually installed. RDEPEND already names it; this is flagged
here so that a future pip-equivalent lookup doesn't regress to a
separate `dev-python/pynvml` (which doesn't exist in ::gentoo).
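
A rough post-install check for the first fix, as a sketch only: the
helper name `leaked_build_modules` is hypothetical and not part of the
ebuild, and the two module names are the ones from the py-modules list
quoted above. It reports which of those names still sit at the top of a
given site-packages directory.

```python
from pathlib import Path

# Module names taken from upstream's py-modules list quoted above.
BUILD_HELPERS = ("build_backend", "build_utils")


def leaked_build_modules(site_packages, names=BUILD_HELPERS):
    """Return the sorted subset of `names` found at the top of site_packages.

    Hypothetical helper for illustration; not part of the ebuild.
    """
    root = Path(site_packages)
    found = set()
    for name in names:
        # A top-level module is either name.py or a package directory.
        if (root / f"{name}.py").is_file() or (root / name).is_dir():
            found.add(name)
    return sorted(found)
```

An empty result for the installed image would suggest the removal in
python_install_all took effect.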
PV is 0.6.8_p1 for upstream's 0.6.8.post1 (Gentoo PMS version syntax
has no .postN form); the pypi eclass derives the upstream version
from PV automatically.
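
The suffix mapping can be illustrated with a small sketch. The function
name `gentoo_to_pep440` is hypothetical and covers only the `_p` suffix
discussed here; it is not the eclass's implementation.

```python
import re


def gentoo_to_pep440(pv):
    """Map a trailing Gentoo-style `_pN` suffix to PEP 440's `.postN`.

    Versions without a `_pN` suffix are returned unchanged.
    """
    return re.sub(r"_p(\d+)$", r".post\1", pv)
```

Under this assumption, `gentoo_to_pep440("0.6.8_p1")` yields the
upstream version string named in the commit subject.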

