SemHash is a lightweight, multimodal library for semantic deduplication, outlier filtering, and representative sample selection. Text works out of the box with fast Model2Vec embeddings, and images, ...
WhisperS2T is an optimized lightning-fast open-sourced Speech-to-Text (ASR) pipeline. It is tailored for the whisper model to provide faster whisper transcription. It's designed to be exceptionally ...