funNLP vs langid.py

funNLP

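Chinese and English sensitive words; language detection; Chinese and international mobile/landline number location and carrier lookup; gender inference from names; mobile number extraction; ID card number extraction; email extraction; Chinese and Japanese personal name databases; Chinese abbreviation database; character decomposition dictionary; word sentiment values; stop words; reactionary word list; violence and terrorism word list; Traditional/Simplified Chinese conversion; English approximating Chinese pronunciation; Wang Feng lyrics generator; occupation name lexicon; synonym lexicon; antonym lexicon; negation word lexicon; car brand lexicon; car parts lexicon; segmentation of unspaced English text; various Chinese word vectors; comprehensive list of company names; classical poetry corpus; IT lexicon; finance lexicon; idiom (chengyu) lexicon; place name lexicon; historical figures lexicon; poetry lexicon; medical lexicon; food and drink lexicon; legal lexicon; automotive lexicon; animal lexicon; Chinese chat corpus; Chinese rumor data; Baidu Chinese question-answering dataset; collection of sentence similarity matching algorithms; BERT resources; text generation and summarization tools; cocoNLP information extraction tool; regex matching for Chinese domestic phone numbers; Tsinghua University XLORE: Chinese-English cross-lingual encyclopedic knowledge graph; Tsinghua University AI technology report series; natural language generation; the "NLU is too hard" series; automatic couplet data and bot; username blacklist; criminal charge and legal terms with classification model; WeChat official account corpus; cs224n deep learning for NLP course; Chinese handwritten character recognition; Chinese NLP corpora/datasets; variable naming tool; word segmentation corpus + code; English task-oriented dialogue datasets; ASR speech datasets + deep-learning-based Chinese speech recognition system; laughter detector; Microsoft multilingual recognizer package for numbers, units, and date/time; Chinese Xinhua Dictionary database and API (including common xiehouyu sayings, idioms, words, and characters); automatic document graph generation; spaCy Chinese model; new version of the Common Voice speech recognition dataset; neural relation extraction; BERT-based named entity recognition; keyphrase extraction package pke; question-answering system based on a medical knowledge graph; event triple extraction based on dependency parsing and semantic role labeling; 40,000 sentences of high-quality annotated dependency parsing data; cnocr: a Python 3 package for Chinese OCR; Chinese person-relationship knowledge graph project; roundup of Chinese NLP competition projects and code; Chinese character data; speech-aligner: a tool that produces phoneme-level time alignment annotations from human speech and its transcript; AmpliGraph: a Python library for knowledge graph representation learning: knowledge graph concept link prediction; Scattertext text visualization (Python); language/knowledge representation tools: BERT & ERNIE; survey of differences between Chinese and English NLP; Synonyms: Chinese near-synonym toolkit; HarvestText: domain-adaptive text mining tool (new word discovery, sentiment analysis, entity linking, etc.); word2word: (Py (by fighting41love)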

Natural Language Processing Chinese

zhuanlan.zhihu.com

langid.py

Stand-alone language identification system (by saffsd)

Natural Language Processing

Project         funNLP               langid.py
Mentions        -                    2
Stars           64,177               2,242
Growth          -                    -
Activity        3.7                  0.0
Latest Commit   about 2 months ago   over 4 years ago
Language        Python               Python
License         -                    BSD 3-clause "New" or "Revised" License

The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
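
The description above does not give the actual formula, so the following is only a rough sketch of how a recency-weighted, relative activity number could be computed. The exponential decay, the `half_life_days` parameter, and both function names are assumptions made for illustration, not the site's real method.

```python
from datetime import date


def recency_weighted_score(commit_dates, today, half_life_days=90.0):
    """Sum per-commit weights that halve every `half_life_days` days.

    Hypothetical weighting: a commit made today counts 1.0, and older
    commits count progressively less.
    """
    return sum(0.5 ** ((today - d).days / half_life_days) for d in commit_dates)


def activity_from_rank(score, all_scores):
    """Map a raw score to a 0-10 scale by the share of projects scoring lower.

    Under this assumed mapping, 9.0 means 90% of tracked projects score
    lower, i.e. the project is in the top 10%.
    """
    below = sum(1 for s in all_scores if s < score)
    return round(10.0 * below / len(all_scores), 1)


# Usage: compare a project with recent commits against one with old commits.
today = date(2024, 1, 1)
recent_project = recency_weighted_score([date(2023, 12, 20), date(2023, 11, 5)], today)
stale_project = recency_weighted_score([date(2019, 6, 1), date(2019, 5, 1)], today)
```

Because the final number is a rank among all tracked projects, it can change even when a project's own commit history does not.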

funNLP

Posts with mentions or reviews of funNLP. We have used some of these posts to build our list of alternatives and similar projects.

We haven't tracked posts mentioning funNLP yet.
Tracking mentions began in Dec 2020.

langid.py

Posts with mentions or reviews of langid.py. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-01-22.

Curator v0.1.0: Auto-organize large movie collections (AI language detection+sync)
3 projects | /r/jellyfin | 22 Jan 2023

Right now it's in early stages: it can detect languages from audio and subtitles (Whisper+LangID), with good results so far. I tried it on 52 movies and it failed on just one, which was silent. I'm currently working on synchronization: hopefully subtitle timestamps and audio sound effects will suffice for cross-correlation. After that, I'll work on the TUI (and maybe add a proper GUI too) to improve UX.

Announcing Lingua 1.0.0: The most accurate natural language detection library for Python, suitable for long and short text alike
5 projects | /r/Python | 10 Jan 2022

Python is widely used in natural language processing, so there are a couple of comprehensive open source libraries for this task, such as Google's CLD 2 and CLD 3, langid and langdetect. Unfortunately, except for the last one they have two major drawbacks:

What are some alternatives?

When comparing funNLP and langid.py you can also consider the following projects:

TextBlob - Simple, Pythonic, text processing--Sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more.

polyglot - Multilingual text (NLP) processing toolkit

pkuseg-python - The pkuseg toolkit for multi-domain Chinese word segmentation

spaCy - 💫 Industrial-strength Natural Language Processing (NLP) in Python

py3langid - Faster, modernized fork of the language identification tool langid.py

quepy - A python framework to transform natural language questions to queries in a database query language.

Jieba - "Jieba" Chinese word segmentation

NLTK - NLTK Source

stanfordnlp - [Deprecated] This library has been renamed to "Stanza". Latest development at: https://github.com/stanfordnlp/stanza
