Python NLP

Open-source Python projects categorized as NLP

Top 23 Python NLP Projects

  • transformers

    🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

    Project mention: The 10 Best Open-Source Artificial Intelligence Tools | dev.to | 2024-08-21

    https://github.com/huggingface/transformers

  • ailearning

    AiLearning: data analysis + machine learning in practice + linear algebra + PyTorch + NLTK + TF2

    Project mention: Top Github repositories for 10+ programming languages | dev.to | 2024-07-16

  • bert

    TensorFlow code and pre-trained models for BERT

    Project mention: OpenAI Will Terminate Its Services in China: A Comprehensive Analysis | dev.to | 2024-06-25

  • HanLP

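    Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, constituency parsing, semantic dependency parsing, semantic role labeling, coreference resolution, style transfer, semantic similarity, new word discovery, keyword and phrase extraction, automatic summarization, text classification and clustering, pinyin and simplified/traditional Chinese conversion, natural language processing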

  • spaCy

    💫 Industrial-strength Natural Language Processing (NLP) in Python

    Project mention: 350M Tokens Don't Lie: Love and Hate in Hacker News | news.ycombinator.com | 2024-08-13

    Is this just using an LLM to be cool? How does a pure LLM prompted with a simple "on a scale of 0-10" stack up against traditional, battle-tested sentiment analysis tools?

    Gemini suggests NLTK and spaCy

    https://www.nltk.org/

    https://spacy.io/

  • unilm

    Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

    Project mention: A Picture Is Worth 170 Tokens: How Does GPT-4o Encode Images? | news.ycombinator.com | 2024-06-07

    Has anyone tried Kosmos [0] ? I came across it the other day and it looked shiny and interesting, but I haven't had a chance to put it to the test much yet.

    [0] - https://github.com/microsoft/unilm/tree/master/kosmos-2.5

  • datasets

    🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools

    Project mention: Go is my hammer, and everything is a nail | news.ycombinator.com | 2024-08-12

    This is my (current) favorite list comprehension: https://github.com/huggingface/datasets/blob/871eabc7b23c27d... Someone was feeling awfully clever that day. (Not that I'm not occasionally guilty myself.)

  • rasa

    💬 Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, Facebook, and more - Create chatbots and voice assistants

    Project mention: 🔥🚀 Top 10 Open-Source Must-Have Tools for Crafting Your Own Chatbot 🤖💬 | dev.to | 2023-11-06

    Support Rasa on GitHub ⭐

  • Chinese-LLaMA-Alpaca

    Chinese LLaMA & Alpaca large language models + local CPU/GPU training and deployment

  • haystack

    🔍 LLM orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.

    Project mention: The open source LLM framework Haystack is trending on GitHub | news.ycombinator.com | 2024-08-26
  • best-of-ml-python

    🏆 A ranked list of awesome machine learning Python libraries. Updated weekly.

    Project mention: Top Github repositories for 10+ programming languages | dev.to | 2024-07-16

  • ragflow

    RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.

    Project mention: RAGFlow 0.9 is released to support GraphRAG end-to-end | news.ycombinator.com | 2024-08-06
  • gensim

    Topic Modelling for Humans

  • flair

    A very simple framework for state-of-the-art Natural Language Processing (NLP)

  • NLTK

    NLTK Source

    Project mention: NLTK version 3.8.2 is no longer available on PyPI | news.ycombinator.com | 2024-08-16
  • PaddleHub

    Awesome pre-trained models toolkit based on PaddlePaddle. (400+ models including Image, Text, Audio, Video and Cross-Modal with Easy Inference & Serving)

  • PaddleNLP

    👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting wide-range of NLP tasks from research to industrial applications, including 🗂Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis etc.

  • TextBlob

    Simple, Pythonic, text processing--Sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more.

    Project mention: Using EvaDB to build AI-enhanced apps | dev.to | 2024-01-10

    TextBlob is a Python toolkit for text processing. It offers some common NLP functionalities such as part-of-speech tagging and noun phrase extraction. We’ll use TextBlob in our project to perform some quick sentiment analysis on tweets.

  • petals

    🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading

    Project mention: How to Run Llama 3 405B on Home Devices? Build AI Cluster | news.ycombinator.com | 2024-07-29

    I love this. How does it compare to something like https://petals.dev/?

  • attention-is-all-you-need-pytorch

    A PyTorch implementation of the Transformer model in "Attention is All You Need".

    Project mention: ElevenLabs Launches Voice Translation Tool to Break Down Language Barriers | news.ycombinator.com | 2023-10-10

    The transformer model was invented to attend to context over the entire sequence length. Look at how the original authors used the Transformer for NMT in the original Vaswani et al publication. https://github.com/jadore801120/attention-is-all-you-need-py...

  • text-generation-inference

    Large Language Model Text Generation Inference

    Project mention: Best LLM Inference Engines and Servers to Deploy LLMs in Production | dev.to | 2024-06-05

    GitHub repository: https://github.com/huggingface/text-generation-inference

  • txtai

    💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows

    Project mention: Embeddings index format for open data access | dev.to | 2024-09-06

    txtai is an all-in-one embeddings database for semantic search, LLM orchestration and language model workflows.

  • GPT2-Chinese

    Chinese version of GPT2 training code, using BERT tokenizer.

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

What are some of the best open-source NLP projects in Python? This list will help you:

 #  Project                            Stars
 1  transformers                       131,636
 2  ailearning                         39,022
 3  bert                               37,771
 4  HanLP                              33,394
 5  spaCy                              29,584
 6  unilm                              19,475
 7  datasets                           18,968
 8  rasa                               18,583
 9  Chinese-LLaMA-Alpaca               18,133
10  haystack                           16,539
11  best-of-ml-python                  16,232
12  ragflow                            16,088
13  gensim                             15,546
14  flair                              13,803
15  NLTK                               13,400
16  PaddleHub                          12,667
17  PaddleNLP                          11,933
18  TextBlob                           9,074
19  petals                             9,049
20  attention-is-all-you-need-pytorch  8,718
21  text-generation-inference          8,666
22  txtai                              8,652
23  GPT2-Chinese                       7,438