The APIs are flexible and easy-to-use, supporting authentication, user identity, and complex enterprise features like SSO and SCIM provisioning. Learn more →
Top 23 Ner Open-Source Projects
-
InfluxDB
Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
-
rust-bert
Rust native ready-to-use NLP pipelines and transformer-based models (BERT, DistilBERT, GPT2,...)
-
NCRFpp
NCRF++, a Neural Sequence Labeling Toolkit. Easy use to any sequence labeling tasks (e.g. NER, POS, Segmentation). It includes character LSTM/CNN, word LSTM/CNN and softmax/CRF components.
-
FARM
:house_with_garden: Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.
-
Recognizers-Text
Microsoft.Recognizers.Text provides recognition and resolution of numbers, units, date/time, etc. in multiple languages (ZH, EN, FR, ES, PT, DE, IT, TR, HI, NL. Partial support for JA, KO, AR, SV). Packages available at: https://www.nuget.org/profiles/Recognizers.Text, https://www.npmjs.com/~recognizers.text
-
entity-recognition-datasets
A collection of corpora for named entity recognition (NER) and entity recognition tasks. These annotated datasets cover a variety of languages, domains and entity types.
-
WorkOS
The modern identity platform for B2B SaaS. The APIs are flexible and easy-to-use, supporting authentication, user identity, and complex enterprise features like SSO and SCIM provisioning.
-
ner-annotator
Named Entity Recognition (NER) Annotation tool for SpaCy. Generates Traning Data as a JSON which can be readily used.
-
CogCompNLP
CogComp's Natural Language Processing Libraries and Demos: Modules include lemmatizer, ner, pos, prep-srl, quantifier, question type, relation-extraction, similarity, temporal normalizer, tokenizer, transliteration, verb-sense, and more.
-
genalog
Genalog is an open source, cross-platform python package allowing generation of synthetic document images with custom degradations and text alignment capabilities.
-
DataTurks
ML data annotations made super easy for teams. Just upload data, add your team and build training/evaluation dataset in hours.
-
concise-concepts
This repository contains an easy and intuitive approach to few-shot NER using most similar expansion over spaCy embeddings. Now with entity scoring.
-
embedders
With embedders, you can easily convert your texts into sentence- or token-level embeddings within a few lines of code. Use cases for this include similarity search between texts, information extraction such as named entity recognition, or basic text classification.
-
SaaSHub
SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives
Project mention: Would this method work to increase the memory of the model? Saving summaries generated by a 2nd model and injecting them depending on the current topic. | /r/LocalLLaMA | 2023-06-09
Project mention: How to leverage the state-of-the-art NLP models in Rust | /r/infinilabs | 2023-06-07brew install libtorch brew link libtorch brew ls --verbose libtorch | grep dylib export LIBTORCH=$(brew --cellar pytorch)/$(brew info --json pytorch | jq -r '.[0].installed[0].version') export LD_LIBRARY_PATH=${LIBTORCH}/lib:$LD_LIBRARY_PATH git clone https://github.com/guillaume-be/rust-bert.git cd rust-bert ORT_STRATEGY=system cargo run --example sentence_embeddings
There is of course the list at https://github.com/juand-r/entity-recognition-datasets, but all of the recent English datasets cover other domains of English, such as the music NER, space NER, etc. All interesting things, but not 2020s English newswire.
Project mention: Show HN: Next-token prediction in JavaScript – build fast LLMs from scratch | news.ycombinator.com | 2024-04-10This is awesome, thanks. I've been messing with wink's NLP library (https://winkjs.org/wink-nlp/) to transform user queries and format responses so I can make a proper chat bot - will see what I can learn from these!
Project mention: A transformer-based method for zero and few-shot biomedical NER | news.ycombinator.com | 2023-05-12
Hey HN! I'm super excited to share Markup with you, which is a totally free & open-source annotation tool that helps you transform unstructured text (e.g. news articles) into structured data that you can use for building, training, or fine-tuning ML models!
Check it out: https://github.com/samueldobbie/markup
Ner related posts
- Recent English newswire NER datasets?
- A transformer-based method for zero and few-shot biomedical NER
- Towards a Tagalog NLP pipeline: Building a spaCy model from scratch
- Any large manually annotated NER datasets?
- Speech and Language Processing (3rd ed. draft)
- Microsoft Unveils Genalog: An Open Source, AI Cross-Platform Python Package For Generating Document Images With Synthetic Noise
- [P] German NER on Legal Data using BERT
-
A note from our sponsor - WorkOS
workos.com | 28 Apr 2024
Index
What are some of the best open-source Ner projects? This list will help you:
Project | Stars | |
---|---|---|
1 | Chinese-Names-Corpus | 3,824 |
2 | DeepKE | 2,929 |
3 | rust-bert | 2,418 |
4 | NCRFpp | 1,877 |
5 | FARM | 1,723 |
6 | Recognizers-Text | 1,645 |
7 | entity-recognition-datasets | 1,431 |
8 | wink-nlp | 1,143 |
9 | TencentPretrain | 975 |
10 | BERTweet | 557 |
11 | ner-annotator | 504 |
12 | CogCompNLP | 469 |
13 | malaya | 456 |
14 | MedCAT | 408 |
15 | zshot | 319 |
16 | genalog | 295 |
17 | bert-sklearn | 293 |
18 | DataTurks | 255 |
19 | concise-concepts | 240 |
20 | markup | 231 |
21 | huspacy | 148 |
22 | gpt-graph | 63 |
23 | embedders | 21 |
Sponsored