polyglot vs Stanza

polyglot

Multilingual text (NLP) processing toolkit (by aboSamoor)

Natural Language Processing

polyglot-nlp.com

Stanza

Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages (by stanfordnlp)

Natural Language Processing, General, Python, NLP, Machine Learning, Deep Learning, Artificial intelligence, Pytorch, universal-dependencies, named-entity-recognition, Corenlp

stanfordnlp.github.io

Project	polyglot	Stanza
Mentions	1	8
Stars	2,261	7,047
Growth	-	1.1%
Activity	0.0	9.8
Latest Commit	6 months ago	4 days ago
Language	Python	Python
License	GNU General Public License v3.0 or later	GNU General Public License v3.0 or later

Mentions is the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars is the number of stars that a project has on GitHub. Growth is the month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is among the top 10% of the most actively developed projects that we are tracking.
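
The exact formulas behind Growth and Activity are not given here, so the following is only a rough sketch of how such numbers could be computed. The exponential decay, the 30-day half-life, the percentile scaling, and all function names are assumptions made for illustration, not the site's actual method.

```python
from datetime import date


def monthly_growth(stars_last_month, stars_now):
    """Month-over-month growth in stars, as a percentage."""
    if stars_last_month <= 0:
        raise ValueError("need a positive star count for the previous month")
    return 100.0 * (stars_now - stars_last_month) / stars_last_month


def recency_weighted_commits(commit_dates, today, half_life_days=30.0):
    """Sum of commit weights, where a commit's weight halves every
    half_life_days (an assumed decay, chosen only for illustration)."""
    return sum(0.5 ** ((today - d).days / half_life_days) for d in commit_dates)


def activity_score(raw, all_raw):
    """Scale a raw score to 0-10 by the share of tracked projects it beats,
    so 9.0 means the project is ahead of 90% of them (the top 10%)."""
    if not all_raw:
        raise ValueError("need at least one tracked project")
    below = sum(1 for other in all_raw if other < raw)
    return round(10.0 * below / len(all_raw), 1)
```

Under these assumptions, a project with no recent commits gets a raw score near zero and therefore a low relative score, while a project that beats nine out of ten tracked projects scores 9.0.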

polyglot

Posts with mentions or reviews of polyglot. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2021-09-25.

How different transliteration libraries compare (unihandecode, polyglot, ntlk?)
2 projects | /r/learnpython | 25 Sep 2021

Stanza

Posts with mentions or reviews of Stanza. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-07-23.

Down and Out in the Magic Kingdom
1 project | news.ycombinator.com | 23 Jul 2023
Parts of speech tagged for German
3 projects | /r/German | 6 Jan 2023

I use Python's spaCy library (https://spacy.io/models/de) or Stanza (https://stanfordnlp.github.io/stanza/), each with its own language models.
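
As a minimal sketch of the Stanza route mentioned in this post, the snippet below tags German text with universal part-of-speech labels. The helper names tag_german and format_tags are hypothetical, and the pipeline call needs the third-party stanza package plus a one-time model download, so it is wrapped in a function rather than run on import.

```python
def format_tags(words):
    """Render (text, upos) pairs as 'text/UPOS' tokens on one line."""
    return " ".join(f"{text}/{upos}" for text, upos in words)


def tag_german(text):
    """Return one list of (word, universal POS tag) pairs per sentence.

    Requires `pip install stanza`; the German models are downloaded
    on first use.
    """
    import stanza  # third-party, imported lazily on purpose

    stanza.download("de")
    # German uses the multi-word token (mwt) processor before tagging.
    nlp = stanza.Pipeline("de", processors="tokenize,mwt,pos")
    doc = nlp(text)
    return [[(w.text, w.upos) for w in sent.words] for sent in doc.sentences]
```

With the models installed, something like `for sent in tag_german("Der Hund schläft."): print(format_tags(sent))` would print one tagged line per sentence.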
Off the shelf sentence parsers?
2 projects | /r/LanguageTechnology | 26 Aug 2022

Stanza has a constituency parser. There's a model compatible with the dev branch with an accuracy of 95.8 on PTB, using RoBERTa as the bottom layer, so it's pretty decent at this point. (The currently released model is not as accurate, but it's easy to get the better model to you.) There's also Tregex, a Java add-on, which can very easily search for the noun phrase highest up in the tree: NP !>> NP matches a noun phrase that is not dominated by any higher noun phrase.
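
The Tregex pattern NP !>> NP can also be approximated in plain Python by walking the tree and stopping at the first NP on each path. This is a sketch rather than Tregex itself: it assumes each node exposes a label and a children sequence (which is how Stanza's constituency trees are understood to be laid out), and Node, topmost_nps, leaves and parse_with_stanza are hypothetical names used only for the demonstration.

```python
from collections import namedtuple

# Stand-in for a parse tree node; a leaf is a node with no children.
Node = namedtuple("Node", ["label", "children"])


def topmost_nps(tree):
    """Yield each NP subtree that has no NP ancestor (like NP !>> NP)."""
    if tree.label == "NP":
        yield tree
        return  # do not descend: nested NPs are dominated by this one
    for child in tree.children:
        yield from topmost_nps(child)


def leaves(tree):
    """Collect the words under a subtree, left to right."""
    if not tree.children:
        return [tree.label]
    return [word for child in tree.children for word in leaves(child)]


def parse_with_stanza(text):
    """Return one constituency tree per sentence (needs stanza and models)."""
    import stanza  # third-party, imported lazily on purpose

    stanza.download("en")
    nlp = stanza.Pipeline("en", processors="tokenize,pos,constituency")
    return [sent.constituency for sent in nlp(text).sentences]


def pre(tag, word):
    """Build a preterminal node such as (DT the)."""
    return Node(tag, (Node(word, ()),))


# (ROOT (S (NP (NP the dog) (PP of (NP the king))) (VP sleeps)))
example = Node("ROOT", (Node("S", (
    Node("NP", (
        Node("NP", (pre("DT", "the"), pre("NN", "dog"))),
        Node("PP", (pre("IN", "of"),
                    Node("NP", (pre("DT", "the"), pre("NN", "king"))))),
    )),
    Node("VP", (pre("VBZ", "sleeps"),)),
)),))

phrases = [" ".join(leaves(np)) for np in topmost_nps(example)]
```

On the hand-built example tree, only the outermost noun phrase is returned; the two nested NPs are skipped because they sit under it.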
The Spacy NER model for Spanish is terrible
2 projects | /r/LanguageTechnology | 20 Dec 2021
Spacy vs NLTK for Spanish Language Statistical Tasks
1 project | /r/LanguageTechnology | 12 Nov 2021
Stanza not tokenising sentences as expected
1 project | /r/learnpython | 3 Nov 2021

I am using Stanza to tokenise the sentences:
Stanza – A Python NLP Package for Many Human Languages
1 project | /r/programming | 29 Oct 2021

1 project | news.ycombinator.com | 27 Oct 2021

What are some alternatives?

When comparing polyglot and Stanza you can also consider the following projects:

spaCy - 💫 Industrial-strength Natural Language Processing (NLP) in Python

NLTK - NLTK Source

TextBlob - Simple, Pythonic, text processing--Sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more.

BERT-NER - Pytorch-Named-Entity-Recognition-with-BERT

langid.py - Stand-alone language identification system

Jieba - Chinese word segmentation (结巴中文分词)

flair - A very simple framework for state-of-the-art Natural Language Processing (NLP)

Pattern - Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.

pytext - A natural language modeling framework based on PyTorch