llama-optimus

llama-optimus is a lightweight Python tool that automatically tunes llama.cpp performance flags to maximize text-generation (tg) and prompt-processing (pp) throughput in tokens/s. It is powered by Bayesian optimization with Optuna.

Installation

In a virtualenv (create one first if you do not already have one):

pip3 install llama-optimus
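
To confirm that the install landed in the active virtualenv, a short check with the standard library is usually enough. The sketch below is only an illustration: the helper name installed_version is hypothetical and not part of llama-optimus, and it assumes Python 3.8 or newer, where importlib.metadata is available.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = "llama-optimus") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    Hypothetical helper for illustration; it only queries package metadata
    and does not import or run llama-optimus itself.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("llama-optimus is not installed in this environment")
    else:
        print(f"llama-optimus {found} is installed")
```

If the script reports that the package is missing, the most likely cause is that pip3 belongs to a different interpreter than the one running the check.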

Dependencies

Releases

Version  Released    Bullseye (Python 3.9)  Bookworm (Python 3.11)  Trixie (Python 3.13)  Files
0.1.9    2025-06-30
0.1.8    2025-06-21
0.1.7    2025-06-20
0.1.6    2025-06-18
0.1.5    2025-06-18
0.1.4    2025-06-17
0.1.3    2025-06-17
0.1.1    2025-06-14

Page last updated 2026-05-13 07:50:16 UTC