vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Installation

In a virtualenv (create one first if you need to):

pip3 install vllm
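Once installed, vLLM can serve an OpenAI-compatible HTTP API (for example via `python -m vllm.entrypoints.openai.api_server --model <model>`). A minimal stdlib-only client sketch for that API, assuming a server is already running at `localhost:8000`; the helper names and the base URL here are illustrative, not part of vLLM itself:

```python
import json
import urllib.request


def build_completion_request(model, prompt, max_tokens=64):
    """Build the JSON payload for an OpenAI-style /v1/completions endpoint."""
    return {"model": model, "prompt": prompt, "max_tokens": max_tokens}


def complete(base_url, model, prompt):
    """POST a completion request and return the first generated text."""
    payload = json.dumps(build_completion_request(model, prompt)).encode()
    req = urllib.request.Request(
        f"{base_url}/v1/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"]
```

With a server started as above, `complete("http://localhost:8000", "facebook/opt-125m", "Hello,")` would return the generated continuation; the model name is only an example.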

Releases

Version        Released
0.5.3.post1    2024-07-23
0.5.3          2024-07-23
0.5.2          2024-07-15
0.5.1          2024-07-06
0.5.0.post1    2024-06-14
0.5.0          2024-06-11
0.4.3          2024-06-01
0.4.2          2024-05-05
0.4.1          2024-04-24
0.4.0.post1    2024-04-03
0.4.0          2024-03-31
0.3.3          2024-03-01
0.3.2          2024-02-21
0.3.1          2024-02-17
0.3.0          2024-01-31
0.2.7          2024-01-04
0.2.6          2023-12-17
0.2.5          2023-12-14
0.2.4          2023-12-11
0.2.3          2023-12-03
0.2.2          2023-11-19
0.2.1.post1    2023-10-17
0.2.1          2023-10-16 (yanked)
0.2.0          2023-09-28
0.1.7          2023-09-11
0.1.6          2023-09-08
0.1.5          2023-09-08
0.1.4          2023-08-25
0.1.3          2023-08-02
0.1.2          2023-07-05
0.1.1          2023-06-22
0.1.0          2023-06-20
0.0.1          2023-06-19


Page last updated 2024-07-23 23:50:49 UTC