pySBD vs razdel
Metric | pySBD | razdel |
---|---|---|
Mentions | 3 | 1 |
Stars | 733 | 244 |
Growth | - | 0.4% |
Activity | 0.0 | 2.1 |
Last commit | 8 months ago | 10 months ago |
Language | Python | Python |
License | MIT License | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
pySBD
- Stand-alone sentence segmenter
- Help with Sentence Splitting
https://github.com/nipunsadvilkar/pySBD This will solve your case.
- Help with Sentence splitting
https://github.com/nipunsadvilkar/pySBD This might help you. This is the best sentence splitter I ever came across.
razdel
- Silero V3: fast high-quality text-to-speech in 20 languages with 173 voices
Also, we have currently abandoned batching, so GPUs are not really required at all.
> the quality (as in: what I'm hearing, not a formally measured metric) is good but (YMMV) not as good as turtle.
I believe the compute required during training and inference … may differ by 3 or 4 orders of magnitude (!).
Also note that some speakers and languages just sound better due to the high quality of the source material and the amount of work and polish invested.
> it breaks with strange error messages if the text you feed it is too long
Well, there should be a warning somewhere, but it works with text no longer than 512-1024 characters.
> there is mention of "a model for text repunctuation and recapitalization", which I wonder if it could be used to break a very long text (eg a book) into pieces that can be digested by the tts engine
This model only restores some punctuation marks and capital letters.
There are libraries like razdel for this - https://github.com/natasha/razdel
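As a rough illustration of the suggestion above, here is a minimal sketch of splitting a long text into sentences and packing them into chunks that stay under a length limit before handing them to a TTS engine. The function names (`chunk_text`, `naive_sentences`, `razdel_sentences`) are hypothetical, and the default limit of 512 is an assumption taken from the lower bound quoted above. The razdel-based splitter relies on `razdel.sentenize`; the regex splitter is only a crude stand-in so the sketch runs without razdel installed.

```python
import re
from typing import Callable, List


def naive_sentences(text: str) -> List[str]:
    """Crude fallback: split after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def razdel_sentences(text: str) -> List[str]:
    """Split with razdel (requires `pip install razdel`)."""
    # Imported lazily so the rest of the sketch works without razdel.
    from razdel import sentenize
    return [s.text for s in sentenize(text)]


def chunk_text(
    text: str,
    max_chars: int = 512,  # assumed limit, see the 512-1024 range above
    split: Callable[[str], List[str]] = naive_sentences,
) -> List[str]:
    """Pack whole sentences into chunks of at most max_chars characters."""
    chunks: List[str] = []
    current = ""
    for sentence in split(text):
        # A single sentence longer than the limit is cut at the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

To use razdel for the sentence boundaries, pass `split=razdel_sentences`; each returned chunk can then be synthesized separately and the audio concatenated.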
What are some alternatives?
pycrown - PyCrown - Fast raster-based individual tree segmentation for LiDAR data
silero-models - Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
dipy - DIPY is the paragon 3D/4D+ imaging library in Python. Contains generic methods for spatial normalization, signal processing, machine learning, statistical analysis and visualization of medical images. Additionally, it contains specialized methods for computational anatomy including diffusion, perfusion and structural imaging.
ttsprech - Simple text2speech for the command line
Sentence-Adder-Anki-Addon - Add sentences to Anki editor window in one click
Voice-Cloning-App - A Python/Pytorch app for easily synthesising human voices
caer - High-performance Vision library in Python. Scale your research, not boilerplate.
wtpsplit - Code for Where's the Point? Self-Supervised Multilingual Punctuation-Agnostic Sentence Segmentation
albumentations - Fast image augmentation library and an easy-to-use wrapper around other libraries. Documentation: https://albumentations.ai/docs/ Paper about the library: https://www.mdpi.com/2078-2489/11/2/125
spaCy - 💫 Industrial-strength Natural Language Processing (NLP) in Python
add-stress-to-epub - A program that sets the stress and the letter ё of Russian text and ebooks using Wiktionary data and grammar analysis.