langid.py
Stand-alone language identification system (by saffsd)
stanfordnlp
[Deprecated] This library has been renamed to "Stanza". Latest development at: https://github.com/stanfordnlp/stanza (by stanfordnlp)
Metric | langid.py | stanfordnlp
---|---|---
Mentions | 2 | -
Stars | 2,242 | 111
Growth | - | 0.9%
Activity | 0.0 | 3.2
Last commit | over 4 years ago | 8 months ago
Language | Python | Python
License | BSD 3-clause "New" or "Revised" License | GNU General Public License v3.0 or later
The number of mentions is the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
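As a rough illustration of the two derived metrics, the sketch below computes month-over-month star growth as a percentage and reads an activity score as a "top X%" share. The function names are hypothetical, the prior-month star count of 110 is an assumed value chosen for the example, and the linear mapping from activity to percentile is an assumption extrapolated from the single 9.0 example above, not a documented formula.

```python
def month_over_month_growth(stars_last_month: int, stars_now: int) -> float:
    """Percentage change in stars relative to last month's count."""
    if stars_last_month <= 0:
        raise ValueError("stars_last_month must be positive")
    return (stars_now - stars_last_month) / stars_last_month * 100.0


def top_share_from_activity(activity: float) -> float:
    """Share (in %) of tracked projects at or above this activity level.

    ASSUMPTION: a linear scale from 0 to 10, so that 9.0 maps to the
    top 10%. The page only states the 9.0 case; other values are a guess.
    """
    if not 0.0 <= activity <= 10.0:
        raise ValueError("activity is assumed to lie between 0 and 10")
    return (10.0 - activity) * 10.0


# Hypothetical input: 110 stars last month, 111 stars now.
growth = round(month_over_month_growth(110, 111), 1)
share = top_share_from_activity(9.0)
print(f"growth: {growth}%  top share: {share}%")
```

With the assumed prior count of 110, the growth figure rounds to the 0.9% shown for a project with 111 stars.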
langid.py
Posts with mentions or reviews of langid.py.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2023-01-22.
- Curator v0.1.0: Auto-organize large movie collections (AI language detection+sync)
Right now it's in early stages: It can detect languages from audio and subtitles (Whisper+LangID) with good results so far tried with 52 movies here (failed with just 1 which was silent). I'm currently working on synchronization: Hopefully subtitle timestamps and audio sound effects can suffice for cross-correlation. After that, I'll work on the TUI (maybe add a proper GUI too) to improve UX.
- Announcing Lingua 1.0.0: The most accurate natural language detection library for Python, suitable for long and short text alike
Python is widely used in natural language processing, so there are a couple of comprehensive open source libraries for this task, such as Google's CLD 2 and CLD 3, langid and langdetect. Unfortunately, except for the last one they have two major drawbacks:
stanfordnlp
Posts with mentions or reviews of stanfordnlp.
We have used some of these posts to build our list of alternatives
and similar projects.
We haven't tracked posts mentioning stanfordnlp yet.
Tracking mentions began in Dec 2020.
What are some alternatives?
When comparing langid.py and stanfordnlp you can also consider the following projects:
polyglot - Multilingual text (NLP) processing toolkit
spaCy - 💫 Industrial-strength Natural Language Processing (NLP) in Python
TextBlob - Simple, Pythonic, text processing--Sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more.
pytext - A natural language modeling framework based on PyTorch
py3langid - Faster, modernized fork of the language identification tool langid.py
Jieba - "Jieba" Chinese word segmentation
Stanza - Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages
NLTK - NLTK Source
PyTorch-NLP - Basic Utilities for PyTorch Natural Language Processing (NLP)