turboquant-plus-vllm

TurboQuant+ compression for vLLM: 4.3x weight compression and 3.7x KV cache compression, with zero calibration.
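
To make the stated ratios concrete, the sketch below applies them to example memory footprints. The 16 GB and 8 GB figures and the helper name `compressed_size` are assumptions chosen purely for illustration; they are not measurements from this package, and real savings depend on the model and workload.

```python
def compressed_size(original_gb: float, ratio: float) -> float:
    """Return the size left after compressing by the given ratio."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return original_gb / ratio


# Hypothetical footprints, chosen only for illustration.
weights_gb = 16.0
kv_cache_gb = 8.0

# 4.3x and 3.7x are the ratios quoted in the package description.
print(f"weights:  {weights_gb} GB -> {compressed_size(weights_gb, 4.3):.2f} GB")
print(f"KV cache: {kv_cache_gb} GB -> {compressed_size(kv_cache_gb, 3.7):.2f} GB")
```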

Installation

Install in a virtualenv:

pip3 install turboquant-plus-vllm
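
After running the command above, one way to confirm that the package is visible in the active environment is to query its installed metadata. This is a minimal sketch using only the Python standard library; the helper name `installed_version` is hypothetical and not part of the package.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


found = installed_version("turboquant-plus-vllm")
if found is None:
    print("turboquant-plus-vllm is not installed in this environment")
else:
    print(f"turboquant-plus-vllm {found} is installed")
```

The lookup uses the distribution name exactly as given to pip, so it does not depend on knowing the package's import name.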

Dependencies

Releases

Distributions: Bullseye (Python 3.9), Bookworm (Python 3.11), Trixie (Python 3.13)

Version  Released
0.13.7   2026-05-16
0.13.3   2026-05-06
0.13.2   2026-05-06
0.13.1   2026-04-25
0.13.0   2026-04-23
0.11.0   2026-04-20
0.7.0    2026-04-13
0.6.0    2026-04-09
0.5.1    2026-04-08
0.5.0    2026-04-08
0.4.0    2026-04-07
0.3.0    2026-04-06
0.2.0    2026-04-04
0.1.0    2026-04-03

Page last updated 2026-05-17 09:32:54 UTC