Top 22 Python named-entity-recognition Projects
-
HanLP
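Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, constituency parsing, semantic dependency parsing, semantic role labeling, coreference resolution, style transfer, semantic similarity, new word discovery, keyword and phrase extraction, automatic summarization, text classification and clustering, pinyin and simplified/traditional Chinese conversion, natural language processing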
Project mention: Hanlp - Natural language processing for the next decade | reddit.com/r/github_trends | 2022-05-28
Tools: Hugging Face, spaCy, scikit-learn, MLflow. There is no flag to distinguish a human owner from a corporate entity, so you have to figure it out on your own. ML can assist, given that there are tens of thousands of records to go through.
-
NLP-progress
Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.
Project mention: [Discussion] Checklist of seminal NLP papers | reddit.com/r/MachineLearning | 2022-12-28
Project mention: Flair: A simple framework for state-of-the-art Natural Language Processing | news.ycombinator.com | 2022-04-11
-
I use Python's spaCy library (https://spacy.io/models/de) or Stanza (https://stanfordnlp.github.io/stanza/), each with its respective language models.
-
simpletransformers
Transformers for Classification, NER, QA, Language Modelling, Language Generation, T5, Multi-Modal, and Conversational AI
-
NCRFpp
NCRF++, a neural sequence labeling toolkit. Easy to use for any sequence labeling task (e.g. NER, POS tagging, segmentation). It includes character LSTM/CNN, word LSTM/CNN and softmax/CRF components.
-
entity-recognition-datasets
A collection of corpora for named entity recognition (NER) and entity recognition tasks. These annotated datasets cover a variety of languages, domains and entity types.
Project mention: Any large manually annotated NER datasets? | reddit.com/r/LanguageTechnology | 2022-12-28
-
seqeval
A Python framework for sequence labeling evaluation (named-entity recognition, POS tagging, etc.)
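To illustrate what entity-level evaluation of sequence labels involves, here is a minimal standard-library sketch. It is not seqeval's implementation: the function names `extract_spans` and `entity_f1` are hypothetical, the input is assumed to be a single sentence in BIO format, and a stray `I-` tag is leniently treated as the start of a new entity.

```python
from typing import List, Set, Tuple

# (entity type, start index, end index exclusive)
Span = Tuple[str, int, int]


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one sentence of BIO tags (assumed format)."""
    spans: Set[Span] = set()
    start, label = None, None
    # A trailing "O" sentinel closes any span that runs to the end.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if not continues:
            if label is not None:
                spans.add((label, start, i))
                start, label = None, None
            # "B-X" opens a span; a stray "I-X" is also accepted as an opener.
            if tag.startswith("B-") or tag.startswith("I-"):
                start, label = i, tag[2:]
    return spans


def entity_f1(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1), counting only exact span matches."""
    gold_spans = extract_spans(gold)
    pred_spans = extract_spans(pred)
    true_positives = len(gold_spans & pred_spans)
    precision = true_positives / len(pred_spans) if pred_spans else 0.0
    recall = true_positives / len(gold_spans) if gold_spans else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


gold_tags = ["B-PER", "I-PER", "O", "B-LOC"]
pred_tags = ["B-PER", "I-PER", "O", "O"]
precision, recall, f1 = entity_f1(gold_tags, pred_tags)
print(precision, recall, round(f1, 3))
```

Scoring whole spans rather than individual tokens means a prediction that gets only part of an entity right earns no credit, which is the usual convention for NER benchmarks.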
-
nlu
1 line for thousands of state-of-the-art NLP models in hundreds of languages. The fastest and most accurate way to solve text problems.
-
Yep! I actually know of something that is exactly what you’re looking for. Note that you will need to know a bit of Python to use it. Here’s the link: https://github.com/philipperemy/name-dataset
-
-
Project mention: Experienced software dev, beginner to NLP. Seeking beginner learning resources, with a specific leaning towards NLP of Chinese language. | reddit.com/r/LanguageTechnology | 2022-01-29
-
camel_tools
A suite of Arabic natural language processing tools developed by the CAMeL Lab at New York University Abu Dhabi.
-
Project mention: Show HN: Zshot, Zero and Few shot named entity and relationships recognition | news.ycombinator.com | 2022-10-28
-
Project mention: [P] HuSpaCy: Industrial-strength Hungarian NLP | reddit.com/r/MachineLearning | 2022-04-27
I'd like to show off a Hungarian NLP pipeline which we've been heavily improving over the past year. https://github.com/huspacy/huspacy
-
genius
💡GENIUS – generating text using sketches! A strong and general textual data augmentation tool.
Project mention: Best language model for filling multiple related masks [D] | reddit.com/r/MachineLearning | 2023-01-09
See https://github.com/beyondguo/genius
-
healthsea
Healthsea is a spaCy pipeline for analyzing user reviews of supplementary products for their effects on health.
-
ARElight
AREkit-based application for a granular view of sentiments between entities in mass-media texts written in Russian
Show HN: ARElight – A Mass-Media Processing Application for Relation Extraction (8 comments)
-
embedders
With embedders, you can easily convert your texts into sentence- or token-level embeddings within a few lines of code. Use cases for this include similarity search between texts, information extraction such as named entity recognition, or basic text classification.
Check out our embedders library if you want to build such embeddings using a high-level, Scikit-Learn-like API.
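As a rough illustration of the similarity-search use case mentioned above, the sketch below ranks texts by cosine similarity between their vectors. It does not use the embedders API: `embed` is a hypothetical toy stand-in (lowercased word counts) for a real sentence embedding, and `most_similar` is likewise an assumed helper name.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Dict[str, float]:
    """Toy stand-in for a sentence embedding: lowercased word counts."""
    return {word: float(n) for word, n in Counter(text.lower().split()).items()}


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(query: str, corpus: List[str], top_k: int = 1) -> List[Tuple[float, str]]:
    """Return the top_k (score, text) pairs, highest similarity first."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, embed(doc)), doc) for doc in corpus]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]


corpus = [
    "named entity recognition in python",
    "time series database",
    "python entity linking",
]
results = most_similar("entity recognition", corpus, top_k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

With a real embedding model in place of `embed`, the ranking step would stay the same; only the vectors would change from sparse word counts to dense floats.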
-
jouresearch-nlp
A Python package for generating topics, named entities and a wordcloud visualization. It leverages the spaCy framework and sentence transformers.
Project mention: Looking for career advice as beginner Python developer with NLP and Backend experience | reddit.com/r/learnpython | 2022-10-05
As a Freelancer - Developed a package for providing meaningful insights on text for journalists (Open Source version of it: https://github.com/joureka-ai/jouresearch-nlp)
Python named-entity-recognition related posts
- One does not simply "create a visualization" from unstructured data!
- Best language model for filling multiple related masks [D]
- Requesting surname data with frequency
- Any large manually annotated NER datasets?
- [Discussion] Checklist of seminal NLP papers
- Has anyone here ever used the seaNMF model for short text topic modeling, and be willing to help me get started with it?
- Hi everyone, my first Reddit post, let me introduce the GENIUS model.
Index
What are some of the best open-source named-entity-recognition projects in Python? This list will help you:
Rank | Project | Stars |
---|---|---|
1 | HanLP | 27,855 |
2 | spaCy | 25,034 |
3 | NLP-progress | 21,260 |
4 | flair | 12,396 |
5 | Stanza | 6,465 |
6 | simpletransformers | 3,475 |
7 | NCRFpp | 1,846 |
8 | entity-recognition-datasets | 1,269 |
9 | BERT-NER | 1,079 |
10 | seqeval | 884 |
11 | nlu | 610 |
12 | name-dataset | 599 |
13 | BERTweet | 498 |
14 | ckip-transformers | 382 |
15 | camel_tools | 280 |
16 | zshot | 194 |
17 | huspacy | 122 |
18 | genius | 110 |
19 | healthsea | 72 |
20 | ARElight | 34 |
21 | embedders | 15 |
22 | jouresearch-nlp | 3 |