polyglot vs langid.py

| | polyglot | langid.py |
|---|---|---|
| Mentions | 1 | 2 |
| Stars | 2,321 | 2,341 |
| Stars growth | 0.3% | - |
| Activity | 0.0 | 0.0 |
| Latest commit | over 1 year ago | about 5 years ago |
| Language | Python | Python |
| License | GNU General Public License v3.0 only | BSD 3-clause "New" or "Revised" License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
Curator v0.1.0: Auto-organize large movie collections (AI language detection + sync)
It's still in the early stages: it can detect languages from audio and subtitles (Whisper + LangID) with good results so far; of the 52 movies I tried it on, it failed on just one, which was silent. I'm currently working on synchronization, hoping that subtitle timestamps and audio sound effects will suffice for cross-correlation. After that, I'll work on the TUI (and maybe add a proper GUI) to improve the UX.
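The cross-correlation idea mentioned above can be sketched in a few lines: treat the subtitle timestamps and the audio's sound-effect activity as two discrete signals, and find the lag that maximizes their correlation. This is a toy illustration with made-up signals, not Curator's actual implementation; `best_lag` is a hypothetical helper name.

```python
# Toy sketch: estimate the offset between a subtitle-activity track and an
# audio-loudness track by brute-force cross-correlation over candidate lags.
# Real pipelines would derive these signals from subtitle files and an
# audio energy envelope; here they are hand-made frame vectors.

def best_lag(audio, subs, max_lag):
    """Return the lag (in frames) that best aligns subs onto audio,
    i.e. the lag maximizing sum(subs[i] * audio[i + lag])."""
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, s in enumerate(subs):
            j = i + lag
            if 0 <= j < len(audio):
                score += s * audio[j]
        if score > best_score:
            best, best_score = lag, score
    return best

# A burst of subtitle activity that occurs 3 frames earlier than the
# matching burst in the audio:
audio = [0, 0, 0, 1, 1, 1, 0, 0, 0, 0]
subs  = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
print(best_lag(audio, subs, 5))  # → 3
```

In practice one would use an FFT-based correlation (e.g. `scipy.signal.correlate`) instead of this O(n·lags) loop, but the alignment principle is the same.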
Announcing Lingua 1.0.0: The most accurate natural language detection library for Python, suitable for long and short text alike
Python is widely used in natural language processing, so there are several comprehensive open-source libraries for this task, such as Google's CLD 2 and CLD 3, langid, and langdetect. Unfortunately, all except the last have two major drawbacks:
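The libraries compared here generally classify text by statistics over character n-grams; langid.py, for instance, uses a trained multinomial naive Bayes model. Below is a deliberately tiny, self-contained sketch of that general approach, with add-one smoothing over character bigrams. The training samples and function names are illustrative assumptions, not any library's actual API or model.

```python
# Toy character-bigram language identifier, in the spirit of langid.py's
# naive Bayes approach: score each language by the smoothed log-probability
# of the input's bigrams under that language's training text.
from collections import Counter
import math

def bigrams(text):
    return [text[i:i + 2] for i in range(len(text) - 1)]

def train(samples):
    """samples: {lang: training_text} -> per-language bigram statistics."""
    models = {}
    for lang, text in samples.items():
        counts = Counter(bigrams(text))
        total = sum(counts.values())
        vocab = len(counts) + 1  # +1 for unseen bigrams
        models[lang] = (counts, total, vocab)
    return models

def classify(models, text):
    """Return the language whose model gives text the highest log-likelihood."""
    best_lang, best_score = None, float("-inf")
    for lang, (counts, total, vocab) in models.items():
        # add-one (Laplace) smoothing so unseen bigrams get nonzero probability
        score = sum(math.log((counts[b] + 1) / (total + vocab))
                    for b in bigrams(text))
        if score > best_score:
            best_lang, best_score = lang, score
    return best_lang

models = train({
    "en": "the quick brown fox jumps over the lazy dog",
    "de": "der schnelle braune fuchs springt ueber den faulen hund",
})
print(classify(models, "the dog"))  # → en
```

Real libraries train on far larger corpora, use feature selection to stay robust across domains, and return calibrated confidence scores; this sketch only shows why short strings are hard: with so few bigrams, a handful of chance overlaps can flip the decision.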
What are some alternatives?
spaCy - 💫 Industrial-strength Natural Language Processing (NLP) in Python
py3langid - Faster, modernized fork of the language identification tool langid.py
NLTK - NLTK Source
TextBlob - Simple, Pythonic, text processing--Sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more.
stanfordnlp - [Deprecated] This library has been renamed to "Stanza". Latest development at: https://github.com/stanfordnlp/stanza
Stanza - Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages
textacy - NLP, before and after spaCy
Jieba - Chinese text segmentation (结巴中文分词)
Pattern - Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.
pntl - Practical Natural Language Processing Tools for Humans, built on top of Senna's NLP predictions: part-of-speech (POS) tagging, chunking (CHK), named entity recognition (NER), semantic role labeling (SRL), and syntactic parsing (PSG) with skip-gram, all in Python, with more features to be added. The project website provides a download link for the Senna tool.
