Natural Language Processing

Open-source projects categorized as Natural Language Processing

Top 23 Natural Language Processing Open-Source Projects

  • transformers

    🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

    Project mention: [D] The current and future state of AI/ML is shockingly demoralizing with little hope of redemption | reddit.com/r/MachineLearning | 2022-08-07

    pip install git+https://github.com/huggingface/transformers

  • funNLP

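    Chinese and English sensitive words, language detection, Chinese and international mobile/phone number location and carrier lookup, gender inference from names, mobile number extraction, ID card number extraction, email extraction, Chinese and Japanese personal-name databases, Chinese abbreviation database, character decomposition dictionary, word sentiment values, stop words, reactionary word list, violence and terrorism word list, Traditional/Simplified Chinese conversion, English approximation of Chinese pronunciation, Wang Feng lyrics generator, job title lexicon, synonym lexicon, antonym lexicon, negation word lexicon, car brand lexicon, car parts lexicon, segmentation of unspaced English text, various Chinese word vectors, company name list, classical poetry corpus, IT lexicon, finance lexicon, idiom lexicon, place name lexicon, historical figures lexicon, poetry lexicon, medical lexicon, food and drink lexicon, legal lexicon, automotive lexicon, animal lexicon, Chinese chat corpus, Chinese rumor data, Baidu Chinese Q&A dataset, collection of sentence similarity matching algorithms, BERT resources, text generation & summarization tools, cocoNLP information extraction tool, regex matching for Chinese domestic phone numbers, Tsinghua University XLORE: Chinese-English cross-lingual encyclopedic knowledge graph, Tsinghua University AI technology report series, natural language generation, "NLU is too hard" series, automatic couplet data and bot, username blacklist, criminal charge and legal terms with classification model, WeChat official account corpus, cs224n deep learning NLP course, Chinese handwritten character recognition, Chinese NLP corpora/datasets, variable naming tool, word segmentation corpus + code, task-oriented dialogue English dataset, ASR speech datasets + deep-learning-based Chinese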

  • bert

    TensorFlow code and pre-trained models for BERT

    Project mention: Improved Content Understanding and Relevance with Large Language Models (SnooBERT ) | reddit.com/r/RedditEng | 2022-07-07

    BERT stands for Bidirectional Encoder Representations from Transformers. It is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right contexts in all layers. It generates state-of-the-art numerical representations that are useful for common language understanding tasks. You can find more details in the paper BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. BERT is used today for popular natural language tasks like question answering, text prediction, text generation, and summarization, and it powers applications like Google Search.
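To make "conditioning on both left and right contexts" concrete, here is a toy sketch in plain Python. It is not BERT's implementation: the function names are hypothetical, whitespace splitting stands in for a real subword tokenizer, and no model is involved. It only contrasts which tokens a left-to-right model and a masked bidirectional model could look at when predicting one position.

```python
MASK = "[MASK]"  # placeholder symbol for the hidden token (illustrative)


def left_to_right_context(tokens, position):
    """Tokens visible to a unidirectional model: only those before `position`."""
    return tokens[:position]


def bidirectional_context(tokens, position):
    """Tokens visible to a masked bidirectional model: everything on both
    sides, with the target position replaced by a mask symbol."""
    return tokens[:position] + [MASK] + tokens[position + 1:]


# Whitespace tokenization is an assumption made for brevity.
tokens = "the bank raised interest rates".split()
target = 1  # index of "bank"

print(left_to_right_context(tokens, target))
print(bidirectional_context(tokens, target))
```

With only the left context, the word to predict is barely constrained; with both sides visible, "raised interest rates" points toward the financial sense of "bank".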

  • Made-With-ML

    Learn how to responsibly deliver value with ML.

    Project mention: Where do I start to learn MLOPS? | reddit.com/r/mlops | 2022-08-10

    PS... +1 on Made With ML, the Hugging Face course is great, and I've heard a ton of good things about MLOps Zoomcamp

  • Jieba

    Jieba Chinese word segmentation

    Project mention: Where can I download a database of Chinese word classifications (noun, verb, etc) | reddit.com/r/ChineseLanguage | 2022-03-28
  • HanLP

    Natural Language Processing for the next decade. Tokenization, Part-of-Speech Tagging, Named Entity Recognition, Syntactic & Semantic Dependency Parsing, Document Classification

    Project mention: Hanlp - Natural language processing for the next decade | reddit.com/r/github_trends | 2022-05-28
  • spaCy

    💫 Industrial-strength Natural Language Processing (NLP) in Python

    Project mention: How to get started with machine learning. | reddit.com/r/rust | 2022-08-09

    Given your need, I think you'll be better off with libraries like spaCy, which does NLP (rather than just DNN inference). You'll get your app much faster this way.

  • NLP-progress

    Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.

    Project mention: [D] How difficult/easy is to learn NLP once you have experience in a CV? | reddit.com/r/MachineLearning | 2021-12-13

    One thing is that NLP is a set of wildly different problems which share some aspects, but often use quite different techniques and assumptions about their datasets. So even if you would have NLP experience, if you'd need to start on a substantially different NLP task, you can't just apply what you know and succeed, you have to review "how things are done" for that problem domain. For a quick overview, sites like https://nlpprogress.com/ can be helpful to see what methods are used; and, perhaps even more importantly, how people are modeling the actual task.

  • applied-ml

    📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.

    Project mention: Top Github repo trends in 2021 | dev.to | 2022-01-12

    The second repo I LOVE is Eugene Yan’s Applied ML repository. This is a brilliant idea to create and actually something I was planning on sort of casually doing in my non-existent free time… Anyhow, it is a curated list of technical posts from top engineering teams (Netflix, Amazon, Pinterest, LinkedIn, etc.) detailing how they built out different types of AI/ML systems (e.g. forecasting, recommenders, search and ranking, etc.). Ofc, it focuses on AI/ML, but something similar could be made for the traditional or BI-oriented analytics stack, as well as the streaming world. Super high value for practitioners! Btw, one of my favorite things at BCG used to be looking at our IT architecture team’s reference architecture diagrams… the best way to understand technologies is to look at how a ton of stuff is architected… and it's fun!

  • rasa

    💬 Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, Facebook, and more - Create chatbots and voice assistants

    Project mention: Seek alternative to Wix chatbox. | reddit.com/r/webdev | 2022-05-28

    Check out Rasa

  • d2l-en

    Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 300 universities from 55 countries including Stanford, MIT, Harvard, and Cambridge.

    Project mention: How to pre-train BERT on different objective tasks using HuggingFace | reddit.com/r/deeplearning | 2022-04-10

    There might be a BERT library for pre-training BERT models in Hugging Face, but I suggest you train a BERT model in native PyTorch to understand the details. Limu's course is recommended for you

  • datasets

    🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools

    Project mention: FauxPilot – an open-source GitHub Copilot server | news.ycombinator.com | 2022-08-02

    And then pass that my_code.json as the dataset name.

    [1] https://github.com/huggingface/datasets

  • awesome-nlp

    📖 A curated list of resources dedicated to Natural Language Processing (NLP)

    Project mention: There is framework for everything. | reddit.com/r/ProgrammerHumor | 2022-08-04
  • gensim

    Topic Modelling for Humans

    Project mention: sentence transformer vector dimensionality reduction to 1 | reddit.com/r/LanguageTechnology | 2022-08-01
  • Awesome-pytorch-list

    A comprehensive list of PyTorch-related content on GitHub, such as different models, implementations, helper libraries, tutorials, etc.

    Project mention: Similar open source long library list to TF like Pytorch "ECOSYSTEM TOOLS" | reddit.com/r/tensorflow | 2021-11-19

    I got the following as a recommendation from elsewhere - https://github.com/jtoy/awesome-tensorflow and there is one for pt as well: https://github.com/bharathgs/Awesome-pytorch-list. Thx for the help :D

  • flair

    A very simple framework for state-of-the-art Natural Language Processing (NLP)

    Project mention: Flair: A simple framework for state-of-the-art Natural Language Processing | news.ycombinator.com | 2022-04-11
  • allennlp

    An open-source NLP research library, built on PyTorch.

    Project mention: AllenNLP will be unmaintained in December | reddit.com/r/hypeurls | 2022-07-11
  • NLTK

    NLTK Source

    Project mention: There is framework for everything. | reddit.com/r/ProgrammerHumor | 2022-08-04
  • clip-as-service

    Embed images and sentences into fixed-length vectors with CLIP

    Project mention: Best models for sentence similarity with good benefit-cost ratio? | reddit.com/r/MLQuestions | 2022-08-08

    you could try Jina.ai's CLIP-as-a-Service: https://github.com/jina-ai/clip-as-service

  • Ciphey

    ⚡ Automatically decrypt encryptions without knowing the key or cipher, decode encodings, and crack hashes ⚡

    Project mention: How do I install Ciphey on Windows 10? | reddit.com/r/techsupport | 2022-07-12

    I followed the steps here. I am running Python 3.10 (64). When I try to install Ciphey using the instructions, on my cmd prompt I get the following:

  • deep-learning-drizzle

    Drench yourself in Deep Learning, Reinforcement Learning, Machine Learning, Computer Vision, and NLP by learning from these exciting lectures!!

    Project mention: Consolidated Video lectures for Machine Learning(including DL, CV, NLP, etc) | reddit.com/r/developersIndia | 2022-01-22

    Also this as well for whoever needs it

  • natural

    general natural language facilities for node

    Project mention: Site that tracks player mentions in /r/FantasyPL (with sentiment) | reddit.com/r/FantasyPL | 2022-07-24

    I used this one: https://www.npmjs.com/package/natural

  • CoreNLP

    Stanford CoreNLP: A Java suite of core NLP tools.

    Project mention: How to use CoreNLP with a large corpus(14.7 GB)? | reddit.com/r/LanguageTechnology | 2022-08-06

    If you need further assistance, you will be better off making an issue on their github: https://github.com/stanfordnlp/CoreNLP

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2022-08-10.

Index

What are some of the best open-source Natural Language Processing projects? This list will help you:

Rank Project Stars
1 transformers 68,093
2 funNLP 42,262
3 bert 31,863
4 Made-With-ML 30,534
5 Jieba 29,046
6 HanLP 26,580
7 spaCy 23,929
8 NLP-progress 20,738
9 applied-ml 20,615
10 rasa 14,661
11 d2l-en 14,476
12 datasets 13,902
13 awesome-nlp 13,525
14 gensim 13,430
15 Awesome-pytorch-list 13,282
16 flair 11,922
17 allennlp 11,142
18 NLTK 10,955
19 clip-as-service 10,557
20 Ciphey 10,464
21 deep-learning-drizzle 10,279
22 natural 9,868
23 CoreNLP 8,588