npu-vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Installation

In a virtualenv (create one first if you do not already have one):

pip3 install npu-vllm
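
To confirm that the install landed in the virtualenv rather than in the system Python, a quick check along the following lines may help. This is a minimal sketch using only the Python standard library (3.8 or later). The helper names `in_virtualenv` and `installed_version` are illustrative and are not part of npu-vllm.

```python
import sys
from importlib import metadata


def in_virtualenv() -> bool:
    """Return True when the interpreter appears to run inside a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def installed_version(dist_name: str):
    """Return the installed version string for dist_name, or None if it is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("virtualenv active:", in_virtualenv())
    print("npu-vllm version:", installed_version("npu-vllm"))
```

If the second line prints `None`, the package is not visible to the interpreter that ran the script, which usually means `pip3` and `python3` point at different environments.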

Releases

Version       Released     Bullseye (Python 3.9)   Bookworm (Python 3.11)   Files
0.4.2.post3   2025-01-16
0.4.2.post2   2025-01-16
0.4.2.post1   2025-01-16


Page last updated 2025-01-16 07:06:50 UTC