real-wordpiece

A score-based implementation of WordPiece tokenization training, compatible with HuggingFace tokenizers.
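To illustrate what "score-based" could mean here, below is a minimal sketch of the commonly described WordPiece merge criterion, where a candidate pair is scored as freq(pair) / (freq(first) * freq(second)) rather than by raw pair frequency. It does not use this package's API. The function name `pair_scores` and the toy corpus are hypothetical, and the package's actual scoring may differ in detail.

```python
from collections import Counter


def pair_scores(word_freqs):
    """Score adjacent symbol pairs as freq(pair) / (freq(a) * freq(b)).

    `word_freqs` maps a tuple of symbols (one word) to its corpus count.
    This is an illustrative helper, not part of real-wordpiece.
    """
    symbol_freqs = Counter()
    pair_freqs = Counter()
    for symbols, freq in word_freqs.items():
        # Count every symbol occurrence, weighted by the word count.
        for symbol in symbols:
            symbol_freqs[symbol] += freq
        # Count every adjacent pair, weighted by the word count.
        for first, second in zip(symbols, symbols[1:]):
            pair_freqs[(first, second)] += freq
    return {
        pair: freq / (symbol_freqs[pair[0]] * symbol_freqs[pair[1]])
        for pair, freq in pair_freqs.items()
    }


# Toy corpus (assumed): words split into characters, "##" marks a continuation.
word_freqs = {
    ("h", "##u", "##g"): 10,
    ("p", "##u", "##g"): 5,
    ("p", "##u", "##n"): 12,
    ("b", "##u", "##n"): 4,
    ("h", "##u", "##g", "##s"): 5,
}

scores = pair_scores(word_freqs)
best_pair = max(scores, key=scores.get)
print(best_pair, scores[best_pair])
```

On this toy corpus the highest-scoring pair is ("##g", "##s"), even though ("##u", "##g") occurs more often. A purely frequency-based merge rule would pick the latter, which is the practical difference a score-based trainer makes.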

Installation

In a virtualenv (create one first if you do not already have one):

pip3 install real-wordpiece

Dependencies

Releases

Version  Released    Bullseye (Python 3.9)  Bookworm (Python 3.11)  Trixie (Python 3.13)  Files
0.1.7    2024-08-30
0.1.6    2024-07-19
0.1.5    2024-07-16


Page last updated 2025-11-13 10:01:18 UTC