GPTQModel

Production-ready LLM compression/quantization toolkit with hardware-accelerated inference support for both CPU and GPU via HF, vLLM, and SGLang.

Installation

In a virtualenv (see these instructions if you need to create one):

pip3 install gptqmodel
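
To confirm that the install landed in the active virtualenv, a minimal sketch using only the Python standard library is shown below. It assumes the distribution name is `gptqmodel`, as in the pip command above; the helper name `installed_version` is hypothetical and not part of the package.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in the current environment.
        return None


if __name__ == "__main__":
    # "gptqmodel" is the distribution name used in the pip command above.
    version = installed_version("gptqmodel")
    if version is None:
        print("gptqmodel is not installed in this environment")
    else:
        print(f"gptqmodel {version} is installed")
```

The reported version can then be compared against the Releases list below.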

Dependencies

  • None

Releases

Version Released Bullseye (Python 3.9) Bookworm (Python 3.11) Trixie (Python 3.13) Files
6.0.3 2026-04-03      
6.0.0 2026-04-02      
5.8.0 2026-03-19      
5.7.0 2026-02-11      
5.6.12 2025-12-17      
5.6.10 2025-12-16      
5.6.8 2025-12-16      
5.6.6 2025-12-15      
5.6.2 2025-12-12      
5.6.0 2025-12-09      
5.4.2 2025-11-16      
5.4.0 2025-11-09      
5.2.0 2025-11-02      
5.0.0 2025-10-24      
4.2.5 2025-09-16      
4.2.0 2025-09-12      
4.1.0 2025-09-08      
4.0.0 2025-08-22      
2.2.0 2025-04-03      
2.1.0 2025-03-13      
2.0.0 2025-03-03      
1.9.0 2025-02-12      
1.8.1 2025-02-08      
1.8.0 2025-02-07      
1.7.4 2025-01-26      
1.7.3 2025-01-21      
1.7.2 2025-01-19      
1.7.0 2025-01-17      
1.6.0 2025-01-06      
1.5.1 2025-01-01      
1.5.0 2024-12-24      
1.4.5 2024-12-19      
1.4.4 2024-12-17      
1.4.3 2024-12-17      
1.4.2 2024-12-16      
1.4.1 2024-12-13      
1.4.0 2024-12-10      
1.3.1 2024-11-29      
1.3.0 2024-11-27      
1.2.3 2024-11-25      
1.2.1 2024-11-11      
1.2.1.dev0 pre-release 2024-11-11      
1.2.0 2024-11-11      
1.1.0 2024-10-29      
1.0.9 2024-10-13      
1.0.8 2024-10-11      
1.0.7 2024-10-08      
1.0.6 2024-09-26      
1.0.5 2024-09-26      
1.0.4 2024-09-26      
1.0.3 2024-09-19      
1.0.2 2024-08-17      
1.0.1 2024-08-15      

Page last updated 2026-04-03 07:52:50 UTC