nm-vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Installation

In a virtualenv (create one first if you do not already have one):

pip3 install nm-vllm
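
If no virtualenv exists yet, the whole sequence might look like the sketch below. It assumes Python 3 with the standard venv module is available. The VENV_DIR variable and the install_nm_vllm function name are hypothetical, introduced only for illustration; they are not part of the package.

```shell
# Hypothetical location for the environment; override by setting VENV_DIR first.
VENV_DIR="${VENV_DIR:-.venv}"

# Hypothetical helper: create a virtualenv, activate it, and install nm-vllm.
install_nm_vllm() {
    # Create the environment with Python's built-in venv module.
    python3 -m venv "$VENV_DIR" || return 1
    # Activate it in the current shell session.
    . "$VENV_DIR/bin/activate" || return 1
    # Install the package inside the active environment.
    pip3 install nm-vllm
}
```

Run install_nm_vllm to perform the steps. Because a shell function runs in the current shell, the environment stays active afterwards; run deactivate to leave it.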

Dependencies

  • None

Releases

Version   Released     Bullseye (Python 3.9)   Bookworm (Python 3.11)   Trixie (Python 3.13)   Files
0.6.3.0   2024-11-06
0.5.3.0   2024-09-05
0.5.2.0   2024-08-12
0.5.1.1   2024-07-17
0.4.0     2024-07-11
0.2.0     2024-04-09
0.1.0     2024-03-02

Page last updated 2025-09-14 09:33:22 UTC