Top 23 Python text-classification Projects
-
HanLP
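Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, constituency parsing, semantic dependency parsing, semantic role labeling, coreference resolution, style transfer, semantic similarity, new word discovery, keyword and phrase extraction, automatic summarization, text classification and clustering, pinyin and simplified/traditional Chinese conversion, natural language processing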
-
Resume-Matcher
Resume Matcher is an open source, free tool to improve your resume. It works by using language models to compare and rank resumes with job descriptions.
-
simpletransformers
Transformers for Information Retrieval, Text Classification, NER, QA, Language Modelling, Language Generation, T5, Multi-Modal, and Conversational AI
-
refinery
The data scientist's open-source choice to scale, assess and maintain natural language data. Treat training data like a software artifact.
-
obsei
Obsei is a low-code, AI-powered automation tool. It can be used in various business flows such as social listening, AI-based alerting, brand image analysis, comparative study, and more.
-
nlu
1 line for thousands of state-of-the-art NLP models in hundreds of languages. The fastest and most accurate way to solve text problems.
-
happy-transformer
Happy Transformer makes it easy to fine-tune and perform inference with NLP Transformer models.
-
PIXIU
This repository introduces PIXIU, an open-source resource featuring the first financial large language models (LLMs), instruction tuning data, and evaluation benchmarks to holistically assess financial LLMs. Our goal is to continually push forward the open-source development of financial artificial intelligence (AI).
Project mention: Step by step guide to create customized chatbot by using spaCy (Python NLP library) | dev.to | 2024-03-23
Hi Community, in this article I will demonstrate the steps below to create your own chatbot using spaCy (an open-source software library for advanced natural language processing, written in Python and Cython):
GitHub: https://github.com/srbhr/Resume-Matcher Website: https://www.resumematcher.fyi/ Discord: Resume Matcher's Discord Tech Stack: Python, NextJS, FastAPI, TypeScript
Project mention: Instance segmentation of small objects in grainy drone imagery | /r/computervision | 2023-12-09
Project mention: My experience on starting with fine tuning LLMs with custom data | /r/LocalLLaMA | 2023-07-10
If you like embeddings and vector DBs, you should look into this: https://github.com/HKUNLP/instructor-embedding
RAG is very difficult to do right. I am experimenting with various RAG projects from [1]. The main problems are:
- Chunking can interfere with context boundaries.
- Content vectors can differ vastly from question vectors. To address this, you have to use hypothetical embeddings (artificial questions are generated for each chunk and stored).
- Instead of saving just one embedding per text chunk, you should store several (the text chunk, hypothetical question embeddings, metadata).
- RAG will fail miserably with requests like "summarize the whole document".
- To my knowledge, OpenAI embeddings don't perform well. Use an embedding that is optimized for question answering or information retrieval and supports multiple languages. Also look into instructor embeddings: https://github.com/embeddings-benchmark/mteb
[1] https://github.com/underlines/awesome-marketing-datascience/...
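The idea of storing several embeddings per chunk (the chunk itself, hypothetical questions, and metadata) can be sketched roughly as follows. This is only an illustration under loud assumptions: `embed` is a toy bag-of-words stand-in for a real embedding model, the hypothetical questions are written by hand rather than generated by an LLM, and `ChunkRecord` and `retrieve` are hypothetical names, not part of any project listed here.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field


def embed(text):
    """Toy stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class ChunkRecord:
    """One text chunk with its hypothetical questions and metadata (hypothetical type)."""
    text: str
    questions: list
    metadata: dict = field(default_factory=dict)
    vectors: list = field(init=False, default_factory=list)

    def __post_init__(self):
        # Store one vector for the chunk plus one per hypothetical question.
        self.vectors = [embed(self.text)] + [embed(q) for q in self.questions]

    def score(self, query):
        # A chunk matches if ANY of its stored vectors is close to the query.
        q = embed(query)
        return max(cosine(q, v) for v in self.vectors)


def retrieve(records, query, k=1):
    """Return the k records with the highest best-vector similarity."""
    return sorted(records, key=lambda r: r.score(query), reverse=True)[:k]


records = [
    ChunkRecord(
        text="Invoices are payable within 30 days of receipt.",
        questions=["When do I have to pay my bill?", "What is the payment deadline?"],
        metadata={"source": "terms.txt", "chunk": 0},
    ),
    ChunkRecord(
        text="Support is available on weekdays from 9 to 17.",
        questions=["When can I contact support?", "What are the support hours?"],
        metadata={"source": "terms.txt", "chunk": 1},
    ),
]

query = "When do I have to pay?"
best = retrieve(records, query)[0]
print(best.metadata, best.text)
```

In this toy setup the query shares no words with the invoice chunk itself, so the chunk vector alone would score zero; the match comes from the stored hypothetical question. A real system would swap `embed` for an actual embedding model and a vector database.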
Project mention: Small-Text: Looking for Contributors (Active Learning, Text Classification, NLP) | /r/LanguageTechnology | 2023-05-21
Python text-classification related posts
-
Which UI library for react or next are you using in your project?
-
Resume Matcher: Free Open Source Python Based ATS with ML
-
My personal project Resume Matcher is trending on GitHub with 500+ stars. Thank you 🙏 for this!
-
Resume Matcher – Free Open Source ATS Tool to Match Resumes to Job Descriptions
-
Show HN: I made an open-source Resume Matcher. A Python based ATS with ML
-
I've made a customisable SMS personal assistant which has infinite and persistent semantic memory.
-
Small-Text: Looking for Contributors (Active Learning, Text Classification, NLP)
Index
What are some of the best open-source text-classification projects in Python? This list will help you:
# | Project | Stars |
---|---|---|
1 | HanLP | 32,388 |
2 | spaCy | 28,751 |
3 | Resume-Matcher | 4,534 |
4 | text-classification-cnn-rnn | 4,078 |
5 | simpletransformers | 3,984 |
6 | catalyst | 3,227 |
7 | instructor-embedding | 1,703 |
8 | eda_nlp | 1,536 |
9 | mteb | 1,395 |
10 | refinery | 1,365 |
11 | text_gcn | 1,326 |
12 | chatgpt-comparison-detection | 1,191 |
13 | obsei | 1,079 |
14 | spacy-llm | 945 |
15 | nlu | 809 |
16 | BERTweet | 558 |
17 | small-text | 520 |
18 | happy-transformer | 500 |
19 | TextFooler | 465 |
20 | PIXIU | 399 |
21 | hover | 314 |
22 | kiri | 240 |
23 | VDCNN | 171 |