langid.py
Stand-alone language identification system (by saffsd)
polyglot
Multilingual text (NLP) processing toolkit (by aboSamoor)
Metric | langid.py | polyglot |
---|---|---|
Mentions | 2 | 1 |
Stars | 2,371 | 2,321 |
Growth | 1.3% | 0.0% |
Activity | 0.0 | 0.0 |
Last commit | over 5 years ago | over 1 year ago |
Language | Python | Python |
License | BSD 3-clause "New" or "Revised" License | GNU General Public License v3.0 only |
Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
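To make the two derived metrics concrete, here is a minimal Python sketch. The exact formulas used on this page are not published here, so both functions are assumptions: `month_over_month_growth` treats growth as a plain percentage change between two monthly star counts, and `relative_activity` maps a hypothetical raw activity score onto a 0-10 scale by the share of tracked projects that score strictly lower. All function names, parameters, and sample numbers are illustrative only.

```python
from bisect import bisect_left
from typing import Sequence


def month_over_month_growth(stars_last_month: int, stars_now: int) -> float:
    """Percentage change in stars between two monthly snapshots (assumed formula)."""
    if stars_last_month <= 0:
        raise ValueError("need a positive baseline star count")
    return (stars_now - stars_last_month) / stars_last_month * 100.0


def relative_activity(raw_score: float, all_raw_scores: Sequence[float]) -> float:
    """Scale a raw score to 0-10 by the share of projects scoring strictly lower.

    The raw score itself (for example, commits weighted by recency) is
    hypothetical; only the percentile-style scaling is sketched here.
    """
    if not all_raw_scores:
        raise ValueError("need at least one tracked project")
    ranked = sorted(all_raw_scores)
    below = bisect_left(ranked, raw_score)  # count of scores strictly lower
    return 10.0 * below / len(ranked)


if __name__ == "__main__":
    # Made-up snapshot: 1,000 stars last month, 1,013 stars now.
    print(round(month_over_month_growth(1000, 1013), 1))
    # Ten made-up projects with raw scores 0..9; the best one outranks 90% of them.
    print(relative_activity(9.0, [float(i) for i in range(10)]))
```

Under this reading, a project that outranks 90% of the tracked projects gets an activity of 9.0, which is consistent with the "top 10%" example above.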
langid.py
Posts with mentions or reviews of langid.py.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2023-01-22.
- Curator v0.1.0: Auto-organize large movie collections (AI language detection+sync)
Right now it's in early stages. It can detect languages from audio and subtitles (Whisper+LangID), with good results so far: tried on 52 movies, it failed on just 1, which was silent. I'm currently working on synchronization; hopefully subtitle timestamps and audio sound effects will suffice for cross-correlation. After that, I'll work on the TUI (and maybe add a proper GUI too) to improve UX.
- Announcing Lingua 1.0.0: The most accurate natural language detection library for Python, suitable for long and short text alike
Python is widely used in natural language processing, so there are a couple of comprehensive open source libraries for this task, such as Google's CLD 2 and CLD 3, langid and langdetect. Unfortunately, except for the last one they have two major drawbacks:
polyglot
Posts with mentions or reviews of polyglot.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2021-09-25.
What are some alternatives?
When comparing langid.py and polyglot you can also consider the following projects:
py3langid - Faster, modernized fork of the language identification tool langid.py
spaCy - 💫 Industrial-strength Natural Language Processing (NLP) in Python
TextBlob - Simple, Pythonic, text processing--Sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more.
textacy - NLP, before and after spaCy
NLTK - NLTK Source