Top 10 Python named-entity-recognition Projects
Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.
Project mention: Upcoming App Announcement: Lemmatize, a Foreign Language Reader | reddit.com/r/languagelearning | 2021-11-11
A standard step in Chinese text processing is word segmentation, which deals with this problem.
A very simple framework for state-of-the-art Natural Language Processing (NLP)
Project mention: How to create a dataset for training NER models when you only have entity data | reddit.com/r/LanguageTechnology | 2021-10-18
We have a list of entities in text files, one entity per line. We intend to train the flair model to detect these entities in text, but NER models require the entities to be labeled in context, within a paragraph, in BIO format.
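One common way to bootstrap such a dataset is to match the entity list against raw text and emit BIO (begin, inside, outside) tags for each token. The sketch below illustrates that idea only; it is not taken from the linked post or from flair. The function names `tokenize` and `bio_tag`, the placeholder label `ENT`, and the regex tokenizer are all assumptions made for illustration.

```python
import re


def tokenize(text):
    # Simplified tokenizer (an assumption): words and single punctuation marks.
    return re.findall(r"\w+|[^\w\s]", text)


def bio_tag(tokens, entity_names, label="ENT"):
    """Tag tokens with B-/I-/O labels by exact, case-insensitive matching.

    entity_names: entity strings, e.g. read one per line from a text file.
    label: hypothetical entity type used for every match.
    """
    tags = ["O"] * len(tokens)
    lowered = [t.lower() for t in tokens]
    # Try longer entities first so "New York" wins over "York".
    patterns = sorted(
        ([t.lower() for t in tokenize(name)] for name in entity_names),
        key=len,
        reverse=True,
    )
    for pattern in patterns:
        n = len(pattern)
        if n == 0:
            continue
        for i in range(len(tokens) - n + 1):
            window_free = all(tag == "O" for tag in tags[i:i + n])
            if lowered[i:i + n] == pattern and window_free:
                tags[i] = "B-" + label
                for j in range(i + 1, i + n):
                    tags[j] = "I-" + label
    return tags


tokens = tokenize("Angela Merkel visited New York.")
tags = bio_tag(tokens, ["Angela Merkel", "New York", "York"])
for token, tag in zip(tokens, tags):
    print(token, tag)
```

Exact string matching of this kind misses inflected or misspelled mentions and cannot resolve ambiguous names, so the resulting labels are usually treated as a noisy starting point to be reviewed by hand.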
Official Stanford NLP Python Library for Many Human Languages
Project mention: Spacy vs NLTK for Spanish Language Statistical Tasks | reddit.com/r/LanguageTechnology | 2021-11-12
Transformers for Classification, NER, QA, Language Modelling, Language Generation, T5, Multi-Modal, and Conversational AI
Project mention: Gpt 2 124m using transformers | reddit.com/r/LanguageTechnology | 2021-06-14
NCRF++, a Neural Sequence Labeling Toolkit. Easy to use for any sequence labeling task (e.g. NER, POS, Segmentation). It includes character LSTM/CNN, word LSTM/CNN and softmax/CRF components.
Project mention: Speech and Language Processing (3rd ed. draft) | news.ycombinator.com | 2021-10-17
They still talk about Hidden Markov Models (HMMs) in quite a bit of detail in the sequence labelling chapter, but you are quite right, Conditional Random Fields (CRFs) and especially neural network based CRFs are in the top rankings when it comes to named entity recognition (NER) and part-of-speech tagging (POS), e.g. see https://github.com/jiesutd/NCRFpp.
Pytorch-Named-Entity-Recognition-with-BERT
Project mention: Training NER models for detecting custom entities | reddit.com/r/LanguageTechnology | 2021-10-08
A Python framework for sequence labeling evaluation (named-entity recognition, POS tagging, etc.)
Project mention: Beginner questions about NER model evaluation. | reddit.com/r/LanguageTechnology | 2021-03-12
The standard way to evaluate NER (or any other sequence labelling problem) is to use the conlleval script (https://www.clips.uantwerpen.be/conll2000/chunking/output.html) or the seqeval package in Python (https://github.com/chakki-works/seqeval). Either way, you need a list of predicted labels and a list of gold labels (see the code example in the link; it should be trivial to convert your output to the same data format).
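To make the idea of comparing predicted labels against gold labels concrete, here is a minimal pure-Python sketch of entity-level precision, recall, and F1 over BIO tags. It is not the code of conlleval or seqeval, which handle more tagging schemes and per-type reporting; the function names `extract_spans` and `entity_f1` are hypothetical, and the sketch assumes a single flat tag sequence.

```python
def extract_spans(tags):
    """Return the set of (label, start, end) entity spans in a BIO sequence."""
    spans = set()
    start, label = None, None
    # A trailing "O" acts as a sentinel that closes a span ending at the last tag.
    for i, tag in enumerate(list(tags) + ["O"]):
        continues_span = tag.startswith("I-") and label == tag[2:]
        if continues_span:
            continue
        if label is not None:
            spans.add((label, start, i))
        if tag == "O":
            start, label = None, None
        else:
            # "B-X" opens a span; a stray "I-X" is also treated as an opening.
            start, label = i, tag[2:]
    return spans


def entity_f1(gold, pred):
    """Score exact span matches: an entity counts only if type and boundaries agree."""
    gold_spans, pred_spans = extract_spans(gold), extract_spans(pred)
    correct = len(gold_spans & pred_spans)
    precision = correct / len(pred_spans) if pred_spans else 0.0
    recall = correct / len(gold_spans) if gold_spans else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


gold = ["B-PER", "I-PER", "O", "B-LOC", "I-LOC", "O"]
pred = ["B-PER", "I-PER", "O", "B-LOC", "O", "O"]
print(entity_f1(gold, pred))
```

In this example the predicted location span is one token too short, so only one of the two entities counts as correct. Scoring whole spans rather than individual tokens is what distinguishes this style of evaluation from plain token accuracy.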
BERTweet: A pre-trained language model for English Tweets (EMNLP-2020)
Project mention: A pre-trained BERT-like model with recent events? | reddit.com/r/LanguageTechnology | 2021-06-24
Not sure if that's what you are looking for, but BERTweet has a model trained on tweets containing COVID keywords: https://github.com/VinAIResearch/BERTweet
A suite of Arabic natural language processing tools developed by the CAMeL Lab at New York University Abu Dhabi.
Project mention: I have a problem in Arabic that I have no idea how to start solving. | reddit.com/r/LanguageTechnology | 2021-10-18
The CAMeL library does tashkeel (diacritization): https://github.com/CAMeL-Lab/camel_tools.
Enables creation of complex NLP pipelines in seconds, for processing static files or streaming text, using a set of simple command line tools. Perform multiple operations on text, such as NER, Sentiment Analysis, Chunking, Language Identification, Q&A, 0-shot Classification and more, by executing a single command in the terminal.
Project mention: NlphoseBuilder : A tool to create NLP pipelines via drag and drop | dev.to | 2021-07-17
The tool generates an nlphose command that can be executed in a docker container to run the pipeline. These pipelines can process streaming text like tweets or static data like files. They can be executed just like normal shell commands using nlphose. Let me show you what I mean!
Python named-entity-recognition related posts
Upcoming App Announcement: Lemmatize, a Foreign Language Reader
2 projects | reddit.com/r/languagelearning | 11 Nov 2021
Is there as site tracking computer vision process?
1 project | reddit.com/r/computervision | 3 Nov 2021
How to create a dataset for training NER models when you only have entity data
1 project | reddit.com/r/LanguageTechnology | 18 Oct 2021
Preparing data for training NER models
1 project | reddit.com/r/LanguageTechnology | 11 Oct 2021
Training NER models for detecting custom entities
2 projects | reddit.com/r/LanguageTechnology | 8 Oct 2021
German POS Corpus for Commercial use
2 projects | reddit.com/r/LanguageTechnology | 5 Oct 2021
[P] NLP "tl;dr" Notes on Transformers
2 projects | reddit.com/r/MachineLearning | 12 Aug 2021
What are some of the best open-source named-entity-recognition projects in Python? This list will help you: