Python named-entity-recognition

Open-source Python projects categorized as named-entity-recognition

Top 10 Python named-entity-recognition Projects

  • GitHub repo NLP-progress

    Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.

    Project mention: Upcoming App Announcement: Lemmatize, a Foreign Language Reader | reddit.com/r/languagelearning | 2021-11-11

    A standard step in Chinese text processing is word segmentation, which deals with this problem.

  • GitHub repo flair

    A very simple framework for state-of-the-art Natural Language Processing (NLP)

    Project mention: How to create a dataset for training NER models when you only have entity data | reddit.com/r/LanguageTechnology | 2021-10-18

    We have a list of entities in text files, one entity per line. We intend to train the flair model to detect these entities in text, but NER models require each entity to be labeled in context, within a paragraph, using the BIO format.
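    One common way to approach the problem described in this mention is to project the entity list onto raw text and emit BIO (begin/inside/outside) tags. The sketch below is only an illustration under stated assumptions: whitespace tokenization, case-insensitive exact matching, and a hypothetical `bio_tag` helper with made-up labels (`PER`, `LOC`). It is not part of flair, and labels produced this way are noisy and usually need manual review.

```python
def bio_tag(tokens, entities):
    """Tag tokens with BIO labels by longest-match lookup in an entity list.

    `entities` maps an entity string to a label, e.g. {"New York": "LOC"}.
    Hypothetical helper for illustration; not part of any library.
    """
    # Index entity phrases as tuples of lowercased whitespace tokens.
    phrases = {tuple(name.lower().split()): label
               for name, label in entities.items()}
    max_len = max((len(p) for p in phrases), default=0)
    lowered = [t.lower() for t in tokens]

    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = 0
        # Try the longest candidate phrase starting at position i first.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            label = phrases.get(tuple(lowered[i:i + n]))
            if label is not None:
                tags[i] = f"B-{label}"
                for j in range(i + 1, i + n):
                    tags[j] = f"I-{label}"
                matched = n
                break
        # Skip past the matched phrase, or advance one token if none matched.
        i += matched if matched else 1
    return tags


# Assumed example data, not taken from the original post.
tokens = "Alice moved from New York to Paris".split()
entities = {"Alice": "PER", "New York": "LOC", "Paris": "LOC"}
for token, tag in zip(tokens, bio_tag(tokens, entities)):
    print(token, tag)
```

    The token/tag pairs printed one per line follow the general column layout used by CoNLL-style NER corpora; the exact file format a given trainer expects should be checked against its own documentation.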

  • GitHub repo Stanza

    Official Stanford NLP Python Library for Many Human Languages

    Project mention: Spacy vs NLTK for Spanish Language Statistical Tasks | reddit.com/r/LanguageTechnology | 2021-11-12

  • GitHub repo simpletransformers

    Transformers for Classification, NER, QA, Language Modelling, Language Generation, T5, Multi-Modal, and Conversational AI

    Project mention: Gpt 2 124m using transformers | reddit.com/r/LanguageTechnology | 2021-06-14

    https://github.com/ThilinaRajapakse/simpletransformers/blob/master/simpletransformers/language_generation/language_generation_model.py#L146

  • GitHub repo NCRFpp

    NCRF++, a Neural Sequence Labeling Toolkit. Easy to use for any sequence labeling task (e.g. NER, POS, Segmentation). It includes character LSTM/CNN, word LSTM/CNN and softmax/CRF components.

    Project mention: Speech and Language Processing (3rd ed. draft) | news.ycombinator.com | 2021-10-17

    They still talk about Hidden Markov Models (HMMs) in quite a bit of detail in the sequence labelling chapter, but you are quite right, Conditional Random Fields (CRFs) and especially neural network based CRFs are in the top rankings when it comes to named entity recognition (NER) and part-of-speech tagging (POS), e.g. see https://github.com/jiesutd/NCRFpp.

  • GitHub repo BERT-NER

    Pytorch-Named-Entity-Recognition-with-BERT

    Project mention: Training NER models for detecting custom entities | reddit.com/r/LanguageTechnology | 2021-10-08

  • GitHub repo seqeval

    A Python framework for sequence labeling evaluation (named-entity recognition, POS tagging, etc.)

    Project mention: Beginner questions about NER model evaluation. | reddit.com/r/LanguageTechnology | 2021-03-12

    The standard way to evaluate NER (or any other sequence labelling problem) is to use the conlleval script (https://www.clips.uantwerpen.be/conll2000/chunking/output.html) or the seqeval package in Python (https://github.com/chakki-works/seqeval). Either way, you need a list of predicted labels and a list of gold labels (see the code example in the link; it should be trivial to convert your output to the same data format).
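    To make the "list of predicted labels and list of gold labels" idea concrete, here is a minimal pure-Python sketch of entity-level scoring over BIO tags, where a predicted entity counts as correct only if its label and boundaries both match the gold entity exactly. The helper names `extract_spans` and `entity_f1` are hypothetical and the sketch handles a single flat tag sequence; it is not the implementation used by conlleval or seqeval, which should be preferred for reported results.

```python
def extract_spans(tags):
    """Return the set of (label, start, end) entity spans in a BIO sequence.

    `end` is exclusive. An I- tag with no preceding B- is treated leniently
    as the start of a new span (an assumption of this sketch).
    """
    spans = set()
    start, label = None, None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        span_ends = tag == "O" or tag.startswith("B-") or tag[2:] != label
        if start is not None and span_ends:
            spans.add((label, start, i))
            start, label = None, None
        if tag != "O" and start is None:
            start, label = i, tag[2:]
    return spans


def entity_f1(gold, pred):
    """Compute exact-match entity-level precision, recall and F1."""
    gold_spans, pred_spans = extract_spans(gold), extract_spans(pred)
    true_pos = len(gold_spans & pred_spans)
    precision = true_pos / len(pred_spans) if pred_spans else 0.0
    recall = true_pos / len(gold_spans) if gold_spans else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Assumed toy labels: the prediction finds the person but misses the location.
gold = ["B-PER", "I-PER", "O", "B-LOC"]
pred = ["B-PER", "I-PER", "O", "O"]
precision, recall, f1 = entity_f1(gold, pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

    Scoring whole entities rather than individual tokens means a prediction that gets only part of a multi-token name earns no credit, which is why token accuracy tends to look much better than entity-level F1 on the same output.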

  • GitHub repo BERTweet

    BERTweet: A pre-trained language model for English Tweets (EMNLP-2020)

    Project mention: A pre-trained BERT-like model with recent events? | reddit.com/r/LanguageTechnology | 2021-06-24

    Not sure if that's what you are looking for, but BERTweet has a model trained on tweets containing COVID keywords: https://github.com/VinAIResearch/BERTweet

  • GitHub repo camel_tools

    A suite of Arabic natural language processing tools developed by the CAMeL Lab at New York University Abu Dhabi.

    Project mention: I have a problem in Arabic that I have no idea how to start solving. | reddit.com/r/LanguageTechnology | 2021-10-18

    The CAMeL library does tashkeel (diacritization): https://github.com/CAMeL-Lab/camel_tools.

  • GitHub repo nlphose

    Enables creation of complex NLP pipelines in seconds, for processing static files or streaming text, using a set of simple command line tools. Perform multiple operations on text, such as NER, Sentiment Analysis, Chunking, Language Identification, Q&A, 0-shot Classification and more, by executing a single command in the terminal.

    Project mention: NlphoseBuilder : A tool to create NLP pipelines via drag and drop | dev.to | 2021-07-17

    The tool generates an nlphose command that can be executed in a docker container to run the pipeline. These pipelines can process streaming text like tweets or static data like files. They can be executed just like a normal shell command using nlphose. Let me show you what I mean!

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2021-11-12.

Index

What are some of the best open-source named-entity-recognition projects in Python? This list will help you:

 #  Project             Stars
 1  NLP-progress        19,449
 2  flair               11,018
 3  Stanza               5,846
 4  simpletransformers   2,852
 5  NCRFpp               1,754
 6  BERT-NER               974
 7  seqeval                658
 8  BERTweet               391
 9  camel_tools            180
10  nlphose                  7