Python nlp-library

Open-source Python projects categorized as nlp-library

Top 17 Python nlp-library Projects

  • transformers

    🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

    Project mention: python connects to amazonaws during diffusion | reddit.com/r/StableDiffusion | 2022-09-02

    There are a lot of external references to their domain in their code. For instance: https://github.com/huggingface/transformers/blob/main/src/transformers/models/bert/tokenization_bert_fast.py

  • spaCy

    💫 Industrial-strength Natural Language Processing (NLP) in Python

    Project mention: What does Spacy use for Documentation? | reddit.com/r/opensource | 2022-09-05

    The readme here should provide some clarity.

  • OpenPrompt

    An Open-Source Framework for Prompt-Learning.

    Project mention: OpenPrompt: An Open-Source Toolkit for Prompt-Learning | news.ycombinator.com | 2021-10-15

  • FARM

    🏡 Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.

  • tika-python

    Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.

    Project mention: Document Parsing - an unsolved problem? | reddit.com/r/LanguageTechnology | 2022-07-19

    At my previous job we had the same problem which we solved by using Tika. We called it on the server along with other stuff, but there is also a Python binding.

  • contextualized-topic-models

    A Python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coherent topics. Published at EACL and ACL 2021.

    Project mention: Extract words from large data set of reviews by sentiment | reddit.com/r/MLQuestions | 2022-05-23

    Use CTM https://github.com/MilaNLProc/contextualized-topic-models with sentiment labels to build a distribution of words over labels.

  • skweak

    skweak: A software toolkit for weak supervision applied to NLP tasks

    Project mention: [P] Programmatic: Powerful Weak Labeling | reddit.com/r/MachineLearning | 2022-04-20

    Code for https://arxiv.org/abs/2104.09683 found: https://github.com/NorskRegnesentral/skweak

  • pythainlp

    Thai Natural Language Processing in Python.

  • OCTIS

    OCTIS: Comparing Topic Models is Simple! A Python package to optimize and evaluate topic models (accepted at EACL 2021 demo track)

    Project mention: Interpretation of topic modeling results between LDA and BERTopic | reddit.com/r/LanguageTechnology | 2022-09-18

    OCTIS

  • SudachiPy

    Python version of Sudachi, a Japanese tokenizer.

    Project mention: Sakubun - a tool I made to help you practice kanji, with customized quiz questions and sentences | reddit.com/r/LearnJapanese | 2022-09-01

    The current readings were generated with SudachiPy, with a little processing. UniDic seems pretty interesting, I'll check it out. Do you know how good its accuracy is compared to Sudachi?

  • camel_tools

    A suite of Arabic natural language processing tools developed by the CAMeL Lab at New York University Abu Dhabi.

    Project mention: I have a problem in Arabic that I have no idea how to start solving. | reddit.com/r/LanguageTechnology | 2021-10-18

    The CaMeL library does tashkeel: https://github.com/CAMeL-Lab/camel_tools.

  • turkish-deasciifier

    Turkish deasciifier in Python based on Deniz Yüret's turkish-mode for Emacs

  • mutate

    A library to synthesize text datasets using Large Language Models (LLM)

    Project mention: Show HN: Mutate – A library to synthesize text datasets using Large LMs | news.ycombinator.com | 2022-03-01

  • toiro

    A comparison tool of Japanese tokenizers

  • mlconjug3

    A Python library to conjugate verbs in French, English, Spanish, Italian, Portuguese and Romanian (more soon) using Machine Learning techniques.

  • taxonomy4good

    Taxonomy4Good: a sustainability lexicon that provides the freedom to create custom taxonomies in addition to listed taxonomies.

    Project mention: Sustainable Data | reddit.com/r/SustainableData | 2022-08-21

    Check out Good Data Hub's taxonomy4good for free access to an open-source sustainability lexicon: https://github.com/GoodDataHub/taxonomy4good/tree/master/taxonomy4good/taxonomies

  • breame

    Lightweight utility tools for the detection of multiple spellings, meanings, and language-specific terminology in British and American English

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2022-09-18.

Index

What are some of the best open-source nlp-library projects in Python? This list will help you:

#   Project                       Stars
1   transformers                 70,618
2   spaCy                        24,240
3   OpenPrompt                    1,916
4   FARM                          1,574
5   tika-python                   1,199
6   contextualized-topic-models     929
7   skweak                          821
8   pythainlp                       740
9   OCTIS                           428
10  SudachiPy                       313
11  camel_tools                     247
12  turkish-deasciifier             128
13  mutate                          103
14  toiro                           103
15  mlconjug3                        44
16  taxonomy4good                     9
17  breame                            7