Metric | langid.py | py3langid
---|---|---
Mentions | 2 | -
Stars | 2,371 | 55
Growth | 1.3% | -
Activity | 0.0 | 4.7
Last commit | over 5 years ago | 5 months ago
Language | Python | Python
License | BSD 3-clause "New" or "Revised" License | GNU General Public License v3.0 or later
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
langid.py
- Curator v0.1.0: Auto-organize large movie collections (AI language detection+sync)
Right now it's in its early stages. It can detect languages from audio and subtitles (Whisper + LangID), with good results so far: I tried it on 52 movies and it failed on just one, which was silent. I'm currently working on synchronization; hopefully subtitle timestamps and audio sound effects will suffice for cross-correlation. After that, I'll work on the TUI (and maybe add a proper GUI too) to improve UX.
- Announcing Lingua 1.0.0: The most accurate natural language detection library for Python, suitable for long and short text alike
Python is widely used in natural language processing, so there are a couple of comprehensive open source libraries for this task, such as Google's CLD 2 and CLD 3, langid and langdetect. Unfortunately, except for the last one they have two major drawbacks:
py3langid
We haven't tracked posts mentioning py3langid yet.
Tracking mentions began in Dec 2020.
What are some alternatives?
polyglot - Multilingual text (NLP) processing toolkit
pntl - Practical Natural Language Processing Tools for Humans is built on top of Senna Natural Language Processing (NLP) predictions: part-of-speech (POS) tags, chunking (CHK), named entity recognition (NER), semantic role labeling (SRL) and syntactic parsing (PSG) with skip-gram, all in Python, with more features still to be added. The website given is for downloading the Senna tool.
TextBlob - Simple, Pythonic, text processing--Sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more.
IEPY - Information Extraction in Python
textacy - NLP, before and after spaCy
NLTK - NLTK Source