vllm-online

A high-throughput and memory-efficient inference and serving engine for LLMs

Installation

In a virtualenv (create one with python3 -m venv if you need to), run:

pip3 install vllm-online
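For a full setup from scratch, the following sketch creates a virtual environment with the standard-library venv module and activates it so that pip3 resolves inside the environment. The final install step needs network access, so it is shown commented out; the `.venv` directory name is just a convention:

```shell
# Create an isolated virtual environment using the stdlib venv module
python3 -m venv .venv

# Activate it so that pip3 now points at the environment's own copy
. .venv/bin/activate
pip3 --version

# With the environment active, install the package:
# pip3 install vllm-online
```

Deactivate the environment at any time with the `deactivate` shell function that activation defines.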

Dependencies

  • None

Releases

Version  Released    Bullseye (Python 3.9)  Bookworm (Python 3.11)  Trixie (Python 3.13)  Files
0.4.2    2024-04-29
0.4.1    2024-04-29

Page last updated 2025-09-14 03:17:30 UTC