gpo.zugaina.org

dev-python/humming-kernels

JIT-compiled quantization GEMM kernel library (vLLM humming backend)

  • humming-kernels-0.1.4
    ~amd64
    python_single_target_python3_12 python_single_target_python3_13 python_single_target_python3_14

    License: Apache-2.0
    Overlay: stuff
  • humming-kernels-0.1.2
    ~amd64
    python_single_target_python3_12 python_single_target_python3_13 python_single_target_python3_14

    License: Apache-2.0
    Overlay: stuff

ChangeLog

commit 1fc4e0437a00dcaf4e556a74d8ee6c40ed6b905a
Author: Ivan S. Titov <iohann.s.titov@gmail.com>
Date: Sun Jun 14 23:40:12 2026 +0200

dev-python/humming-kernels: new package

humming-kernels provides the `humming` Python namespace that vLLM's
quantization registry imports unconditionally on CUDA builds
(quantization/__init__.py pulls in humming.py, whose external import is
gated only by current_platform.is_cuda() with no fallback). Without it,
loading any quantized model under vllm[cuda] aborts with
ModuleNotFoundError: No module named 'humming', regardless of which
quantization method was actually requested.
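
To check for the failure mode described above before loading a model, a
minimal sketch along these lines may help. The helper name
`missing_modules` is hypothetical and not part of vLLM or
humming-kernels; it relies only on the standard library and only tests
whether the `humming` namespace can be located.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    # find_spec() returns None for a top-level module that is not
    # installed, and it does not import the module itself.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "humming" is the namespace named in the commit message.
    absent = missing_modules(["humming"])
    if absent:
        print("not importable:", ", ".join(absent))
    else:
        print("humming is importable")
```

If `humming` is reported as not importable on a CUDA build, the
ModuleNotFoundError described above is the expected result.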

humming-kernels is a JIT GEMM kernel library shipped as a pure-Python
wheel that compiles its bundled CUDA sources at runtime via the system
nvcc. Versions 0.1.2 and 0.1.4 match the humming-kernels pins in
requirements/cuda.txt of vllm 0.22.1 and 0.23.0.
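
Since the kernels are compiled at runtime, a missing compiler only
shows up on first use. The sketch below checks whether an `nvcc`
executable is on PATH. That the library finds nvcc through PATH is an
assumption made here for illustration; the helper name `find_nvcc` is
hypothetical.

```python
import shutil


def find_nvcc(executable="nvcc"):
    """Return the full path of the compiler on PATH, or None if absent."""
    # Assumption: the compiler is discovered through PATH. The library
    # may consult other locations that this check does not cover.
    return shutil.which(executable)


if __name__ == "__main__":
    path = find_nvcc()
    if path is None:
        print("nvcc not found on PATH")
    else:
        print("nvcc found at", path)
```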