selfclean

A holistic self-supervised data cleaning strategy to detect off-topic samples, near duplicates and label errors.

Installation

In a virtualenv (create one first if you do not already have one):

pip3 install selfclean
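
After installing, it can be useful to confirm which release pip actually resolved in the active environment. The following is a minimal sketch using only the Python standard library; the helper name `installed_version` is hypothetical and is not part of selfclean itself.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        # The distribution is not present in the current environment.
        return None


if __name__ == "__main__":
    found = installed_version("selfclean")
    if found is None:
        print("selfclean is not installed in this environment")
    else:
        print(f"selfclean {found} is installed")
```

Run this with the interpreter of the virtualenv you installed into; a different interpreter may report that the package is missing.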

Releases

Version  Released  Bullseye (Python 3.9)  Bookworm (Python 3.11)  Files
0.0.36 2025-03-13    
0.0.35 2025-01-07    
0.0.34 2025-01-06    
0.0.31 2024-10-31    
0.0.30 2024-10-12    
0.0.29 2024-10-12    
0.0.28 2024-10-04    
0.0.27 2024-09-12    
0.0.26 2024-08-27    
0.0.25 2024-08-26    
0.0.24 2024-07-02    
0.0.23 2024-07-02    
0.0.22 2024-05-01    
0.0.21 2024-05-01    
0.0.20 2024-05-01    
0.0.19 2024-04-04    
0.0.18 2024-04-04    
0.0.17 2024-03-23    
0.0.16 2024-03-23    
0.0.15 2024-03-22    
0.0.14 2024-03-22    
0.0.13 2024-03-22    
0.0.12 2024-03-22    
0.0.11 2024-03-22    
0.0.10 2024-03-22    
0.0.9 2024-03-22    
0.0.8 2024-03-21    
0.0.7 2024-03-21    
0.0.6 2024-03-21    
0.0.5 2024-03-19    
0.0.4 2024-03-19    
0.0.2 2024-03-19    
0.0.1 2024-03-19    


Page last updated 2025-06-27 20:36:03 UTC