flash-attention-softmax-n

CUDA and Triton implementations of Flash Attention with SoftmaxN.
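As a point of reference, softmax_n is commonly defined as the standard softmax with a constant n added to the denominator, which lets the attention weights sum to less than one. The snippet below is a plain-PyTorch sketch of that definition only, not the package's fused CUDA/Triton kernels; the function name and argument names are chosen here for illustration.

```python
import torch


def softmax_n_reference(x: torch.Tensor, n: float = 1.0, dim: int = -1) -> torch.Tensor:
    """Reference softmax_n: exp(x_i) / (n + sum_j exp(x_j)).

    With n = 0 this reduces to the ordinary softmax; with n > 0 the
    weights can sum to less than 1 when all logits are strongly negative.
    """
    # Shift by the (non-negative) max for numerical stability; the n term
    # is scaled by the same shift so the result is mathematically unchanged.
    x_max = x.max(dim=dim, keepdim=True).values.clamp(min=0.0)
    exp_x = torch.exp(x - x_max)
    return exp_x / (exp_x.sum(dim=dim, keepdim=True) + n * torch.exp(-x_max))
```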

Installation

In a virtualenv (create one first if you need to):

pip3 install flash-attention-softmax-n
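After installation, the package is intended as a drop-in replacement for standard softmax/attention calls. The import path, function names, and keyword arguments below are assumptions made for illustration and are not verified against this release; consult the project README for the exact API.

```python
# Hypothetical usage sketch -- names and signatures are assumptions, not the
# package's confirmed API; check the project README before relying on them.
import torch
from flash_attention_softmax_n import softmax_n, flash_attention_n  # assumed import path

# softmax with +n in the denominator (assumed signature)
scores = torch.randn(2, 8, 128, 128)
weights = softmax_n(scores, n=1.0)

# fused flash-attention kernel variant (assumed signature; CUDA + fp16/bf16)
q = torch.randn(2, 8, 128, 64, dtype=torch.float16, device='cuda')
k = torch.randn_like(q)
v = torch.randn_like(q)
out = flash_attention_n(q, k, v, softmax_n_param=1.0)
```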

Releases

Version     Released
0.3.2       2023-11-21
0.3.1       2023-09-23
0.3.0       2023-09-05
0.2.1       2023-08-30
0.2.0       2023-08-29
0.1.4       2023-08-28
0.1.3       2023-08-28
0.1.2       2023-08-26
0.1.1       2023-08-26
0.1.0       2023-08-26
0.1.0rc6    2023-08-26 (pre-release)

