pySBD vs razdel

Compare pySBD and razdel and see how they differ.

pySBD

🐍💯 pySBD (Python Sentence Boundary Disambiguation) is a rule-based sentence boundary detection library that works out of the box. (by nipunsadvilkar)

razdel

Rule-based token and sentence segmentation for the Russian language (by natasha)
                pySBD           razdel
Mentions        3               1
Stars           733             244
Growth          -               0.4%
Activity        0.0             2.1
Last commit     8 months ago    10 months ago
Language        Python          Python
License         MIT License     MIT License
The number of mentions indicates the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

pySBD

Posts with mentions or reviews of pySBD. We have used some of these posts to build our list of alternatives and similar projects.

razdel

Posts with mentions or reviews of razdel. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-06-20.
  • Silero V3: fast high-quality text-to-speech in 20 languages with 173 voices
    9 projects | news.ycombinator.com | 20 Jun 2022
    Also currently we abandoned batching, so GPUs are not really required at all.

    > the quality (as in: what I'm hearing, not a formally measured metric) is good but (YMMV) not as good as turtle.

    I believe the compute required during training and inference … may differ by 3 or 4 orders of magnitude (!).

    Also note, that some speakers and languages just sound better due to high quality of source material and the amount of work invested and polish.

    > it breaks with strange error messages if the text you feed it is too long

    Well, there should be a warning somewhere, but it works with text no longer than 512-1024 symbols.

    > there is mention of "a model for text repunctuation and recapitalization", which I wonder if it could be used to break a very long text (eg a book) into pieces that can be digested by the tts engine

    This model only restores some punctuation marks and capital letters.

    There are libraries like razdel for this - https://github.com/natasha/razdel
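
As a rough illustration of the idea in the post above (splitting a long text into pieces short enough for a TTS engine), here is a minimal sketch. It splits the text into sentences and greedily packs them into chunks of at most `max_chars` characters. The names `chunk_text`, `naive_split`, and `razdel_split` are hypothetical helpers, not part of any library; the default limit of 512 is taken from the lower bound mentioned in the post. Only `razdel.sentenize` is a real razdel call, and it is imported lazily so the sketch also runs without razdel installed.

```python
import re
from typing import Callable, List


def naive_split(text: str) -> List[str]:
    """Stand-in splitter (assumption): break after '.', '!' or '?' plus whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def razdel_split(text: str) -> List[str]:
    """Split with razdel's rule-based segmenter (requires `pip install razdel`)."""
    from razdel import sentenize  # lazy import: only needed if this splitter is used
    return [s.text for s in sentenize(text)]


def chunk_text(
    text: str,
    max_chars: int = 512,
    split: Callable[[str], List[str]] = naive_split,
) -> List[str]:
    """Greedily pack whole sentences into chunks of at most max_chars characters."""
    chunks: List[str] = []
    current = ""
    for sentence in split(text):
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            if current:
                chunks.append(current)
            # A single sentence longer than max_chars is kept whole, not cut.
            current = sentence
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for chunk in chunk_text("One. Two. Three.", max_chars=9):
        print(chunk)
```

To use razdel for the sentence splitting, pass `split=razdel_split`. A pySBD-based splitter could be plugged in the same way, for example by wrapping `pysbd.Segmenter(language="en", clean=False).segment`. Sentences that exceed the limit on their own are left intact here and would need separate handling.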

What are some alternatives?

When comparing pySBD and razdel you can also consider the following projects:

pycrown - PyCrown - Fast raster-based individual tree segmentation for LiDAR data

silero-models - Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple

dipy - DIPY is the paragon 3D/4D+ imaging library in Python. Contains generic methods for spatial normalization, signal processing, machine learning, statistical analysis and visualization of medical images. Additionally, it contains specialized methods for computational anatomy including diffusion, perfusion and structural imaging.

ttsprech - Simple text2speech for the command line

Sentence-Adder-Anki-Addon - Add sentences to Anki editor window in one click

Voice-Cloning-App - A Python/Pytorch app for easily synthesising human voices

caer - High-performance Vision library in Python. Scale your research, not boilerplate.

wtpsplit - Code for Where's the Point? Self-Supervised Multilingual Punctuation-Agnostic Sentence Segmentation

albumentations - Fast image augmentation library and an easy-to-use wrapper around other libraries. Documentation: https://albumentations.ai/docs/ Paper about the library: https://www.mdpi.com/2078-2489/11/2/125

spaCy - 💫 Industrial-strength Natural Language Processing (NLP) in Python

add-stress-to-epub - A program that sets the stress and the letter ё of Russian text and ebooks using Wiktionary data and grammar analysis.