Gentoo Portage Overlays - dev-python/flashinfer-python

dev-python/flashinfer-python

FlashInfer: kernel library for LLM serving (Python frontend)

flashinfer-python-0.6.14

~amd64

python_single_target_python3_12 python_single_target_python3_13 python_single_target_python3_14

License: Apache-2.0

Overlay: stuff

flashinfer-python-0.6.13

~amd64

python_single_target_python3_12 python_single_target_python3_13 python_single_target_python3_14

License: Apache-2.0

Overlay: stuff

flashinfer-python-0.6.12

~amd64

python_single_target_python3_12 python_single_target_python3_13 python_single_target_python3_14

License: Apache-2.0

Overlay: stuff

flashinfer-python-0.6.11_p2

~amd64

python_single_target_python3_12 python_single_target_python3_13 python_single_target_python3_14

License: Apache-2.0

Overlay: stuff

ChangeLog

commit 6be5572176683bd38d6f542b3afd9ea7c9fe9c13
Author: Ivan S. Titov <iohann.s.titov@gmail.com>
Date: Wed Jul 8 10:58:04 2026 +0200

dev-python/flashinfer-python: add 0.6.14

commit dd949dba014f0ab03bf9f4d94b55d9b4b5ff34b4
Author: Ivan S. Titov <iohann.s.titov@gmail.com>
Date: Thu Jun 25 17:03:13 2026 +0200

dev-python/flashinfer-python: add 0.6.13

commit 79f0ec86febf86a80f5096504e49a75696e45f7e
Author: Ivan S. Titov <iohann.s.titov@gmail.com>
Date: Tue Jun 16 15:28:47 2026 +0200

dev-python/flashinfer-python: drop 0.6.8_p1

commit a07b7a9df7981d3d1ec6e8442c661bdcdc03a531
Author: Ivan S. Titov <iohann.s.titov@gmail.com>
Date: Sat Jun 6 22:07:59 2026 +0200

dev-python/flashinfer-python: drop 0.6.11, 0.6.11_p3

commit 68de1b47a23f51fed8162033154ed94867742bd9
Author: Ivan S. Titov <iohann.s.titov@gmail.com>
Date: Sat May 30 14:21:06 2026 +0200

dev-python/flashinfer-python: bump 0.6.11_p3 -> 0.6.12

Pairs with the flashinfer-cubin bump. Upstream's requires_dist diff
shows only optional-extra changes (new "nvep" extra pulling
cuda-python; the "cuda-tile[tileiras]" extra is satisfied by our
existing cuda-tile-bin dep). No baseline RDEPEND changes.

commit f3c33810e93cb9e856ed026eb4f1a93f2b6cac3c
Author: Ivan S. Titov <iohann.s.titov@gmail.com>
Date: Sun May 17 00:52:23 2026 +0200

dev-python/flashinfer-python: align 0.6.8_p1 with vllm cuda.txt

Upstream vllm 0.21.0's cuda.txt enforces:
- nvidia-cudnn-frontend>=1.13.0,<1.19.0 (breaking changes in 1.19)
- nvidia-cutlass-dsl==4.4.2 (exact pin)

Apply both to 0.6.8_p1 so vllm[cuda] resolves. Newer 0.6.11.x
versions left alone — they may have been retested against
cudnn-frontend >=1.19 / cutlass-dsl >=4.5; re-audit at consumer-pin
time before tightening those.
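
The two pins above can be checked mechanically. The following is a minimal sketch, not vllm's or Portage's resolver: the helper names are hypothetical, and it only handles plain dotted numeric versions (no .postN, rc, or epoch forms).

```python
def parse_version(version: str) -> tuple:
    """Parse a dotted numeric version into a 3-tuple, padding with zeros.

    Simplification: only plain numeric components are supported.
    """
    parts = [int(p) for p in version.split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def cudnn_frontend_ok(version: str) -> bool:
    """True if version satisfies >=1.13.0,<1.19.0 (the cuda.txt range)."""
    return parse_version("1.13.0") <= parse_version(version) < parse_version("1.19.0")


def cutlass_dsl_ok(version: str) -> bool:
    """True if version matches the exact pin ==4.4.2."""
    return parse_version(version) == parse_version("4.4.2")


if __name__ == "__main__":
    for candidate in ("1.13.0", "1.18.2", "1.19.0"):
        print("cudnn-frontend", candidate, cudnn_frontend_ok(candidate))
    for candidate in ("4.4.2", "4.5.0"):
        print("cutlass-dsl", candidate, cutlass_dsl_ok(candidate))
```

Padding short versions to three components keeps "1.19" from comparing as lower than "1.19.0", which a bare tuple comparison would otherwise allow.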

commit a53be869a67163ee151dee638983986667793284
Author: Ivan S. Titov <iohann.s.titov@gmail.com>
Date: Sat May 16 13:11:07 2026 +0200

dev-python/flashinfer-python: add 0.6.11_p3

PyPI sdist bump: .post2 -> .post3 (our _p2 -> _p3). Pulls flashinfer-cubin ~_p3 via
the version-coupled cond-dep.

commit 3e0f976fd200159cc40eac40cee48f72f5c8ac58
Author: Ivan S. Titov <iohann.s.titov@gmail.com>
Date: Thu May 14 12:01:00 2026 +0200

dev-python/flashinfer-python: add 0.6.11_p2

commit 9b5ba1580af42fb72b97b24231dfae2f64d1a5b3
Author: Ivan S. Titov <iohann.s.titov@gmail.com>
Date: Wed May 13 14:34:48 2026 +0200

dev-python/flashinfer-python: disable py3.11

commit b8143a6ca85d0e978675a1f683b1893cccc119af
Author: Ivan S. Titov <iohann.s.titov@gmail.com>
Date: Sun May 10 15:26:08 2026 +0200

dev-python/flashinfer-python: switch to DISTUTILS_SINGLE_IMPL

sci-ml/pytorch is SINGLE_IMPL; multi-impl consumer with bare
${PYTHON_USEDEP} produces python_targets_python3_*(-)? that the
single-impl child can't expose. The pytorch dep moves to bare
${PYTHON_SINGLE_USEDEP} and the rest of the (multi-impl) deps move
into python_gen_cond_dep.

commit a1b5a8250d662e8d8600990d5dd9578bba113622
Author: Ivan S. Titov <iohann.s.titov@gmail.com>
Date: Sun May 10 16:49:49 2026 +0200

dev-python/flashinfer-python: switch to DISTUTILS_SINGLE_IMPL

sci-ml/pytorch is SINGLE_IMPL; multi-impl consumer with bare
${PYTHON_USEDEP} produces python_targets_python3_*(-)? that the
single-impl child can't expose. The pytorch dep moves to bare
${PYTHON_SINGLE_USEDEP} and the rest of the (multi-impl) deps move
into python_gen_cond_dep.

(Hand-port of the same conversion landing on 0.6.8_p1 to our newer
0.6.11 ebuild.)

commit 97b49dbfa7221a4d42d6f9ac5264f2f5a97d5c5c
Author: Ivan S. Titov <iohann.s.titov@gmail.com>
Date: Sun May 10 15:04:33 2026 +0200

dev-python/flashinfer-python: add 0.6.11

commit b4cf378def63a936ade846dce872c3faa9bd3bcf
Author: Ivan S. Titov <iohann.s.titov@gmail.com>
Date: Thu May 7 18:38:08 2026 +0200

dev-python/flashinfer-python: new package, 0.6.8.post1

Tier 5 of the vllm CUDA target packaging cycle. FlashInfer's Python
frontend — the kernel library that vllm's CUDA target dispatches
attention / fused-MoE / GEMM workloads through.

Pure-Python install at packaging time: the C++ csrc/include and
vendored cutlass / spdlog ride along as data files for the runtime
JIT compilation pipeline. No nvcc invocation at install, so the
gcc-15 host-pin from tilelang isn't needed here. (At first JIT use,
end-users hit the same nvcc/gcc-15 compatibility window — that's
their build environment's responsibility.)

Two ebuild-side fixes:

* Same upstream py-modules-leak issue as torch-c-dlpack-ext, but with
  TWO modules — pyproject.toml's [tool.setuptools] py-modules =
  ["build_backend", "build_utils"] would ship both PEP-517 backend
  helpers at the top of site-packages. Drop both in
  python_install_all.

* Upstream's requirements.txt lists nvidia-ml-py as the runtime dep,
  but at first import flashinfer/utils.py does `import pynvml` —
  the legacy module name. ::gentoo's dev-python/nvidia-ml-py
  installs a `pynvml.py` shim at
  /usr/lib/pythonX.Y/site-packages/pynvml.py, so the runtime dep is
  satisfied as long as nvidia-ml-py is the actually-installed
  package. RDEPEND already names it; flagging here so a future
  pip-equivalent lookup doesn't regress to a separate
  `dev-python/pynvml` (which doesn't exist in ::gentoo).
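
The pynvml point can be sanity-checked on an installed system. The sketch below is not part of the ebuild and the helper name is hypothetical. It only asks the import system whether a top-level module can be found, without importing it.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module can be located by the import system."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # With dev-python/nvidia-ml-py installed, the commit message implies
    # this should report True via the pynvml.py shim.
    print("pynvml importable:", module_available("pynvml"))
```

Using find_spec avoids running pynvml's module-level code, so the check also works on a host without an NVIDIA driver.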

PV translates _p1 ← .post1 (Gentoo PMS forbids .postN); the pypi
eclass handles the auto-derivation.
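
The .postN to _pN mapping is simple enough to illustrate. This is a minimal sketch with a hypothetical helper name, not the pypi eclass's actual code (the eclass works in the other direction, from PV to the PyPI version), and it covers only the trailing post-release suffix.

```python
import re


def pypi_to_gentoo_pv(version: str) -> str:
    """Rewrite a trailing PyPI '.postN' suffix as Gentoo's '_pN'.

    Versions without a post-release suffix are returned unchanged.
    """
    return re.sub(r"\.post(\d+)$", r"_p\1", version)


if __name__ == "__main__":
    for upstream in ("0.6.8.post1", "0.6.11.post3", "0.6.12"):
        print(upstream, "->", pypi_to_gentoo_pv(upstream))
```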