Top 23 Python Summarization Projects
-
haystack
AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.
Are you curious about the NLP/GenAI/RAG framework for developers? Check out my opinionated developer review of Haystack, which emerges as a robust NLP/RAG framework that excels in search and retrieval applications: Read the review.
-
Project mention: Show HN: Toolkit for LLM Fine-Tuning, Ablating and Testing | news.ycombinator.com | 2024-04-07
-
You can explore and contribute to this project on GitHub: ollama-ebook-summary.
-
simpleT5
simpleT5 is built on top of PyTorch-lightning⚡️ and Transformers🤗 and lets you quickly train your T5 models.
-
summarizepaper
An AI-powered arXiv paper summarization website with a virtual assistant for answering questions.
-
CX_DB8
A contextual, biasable, word-, sentence-, or paragraph-level extractive summarizer powered by the latest in text embeddings (BERT, Universal Sentence Encoder, Flair)
I was working on this stuff before it was cool, so in the sense of the precursor to LLMs (and sometimes supporting LLMs still) I've built many things:
1. Games you can play with word2vec or related models (could be drop in replaced with sentence transformer). It's crazy that this is 5 years old now: https://github.com/Hellisotherpeople/Language-games
2. "Constrained Text Generation Studio" - A research project I wrote when I was trying to solve LLM's inability to follow syntactic, phonetic, or semantic constraints: https://github.com/Hellisotherpeople/Constrained-Text-Genera...
3. DebateKG - A bunch of "Semantic Knowledge Graphs" built on my pet debate evidence dataset (LLM backed embeddings indexes synchronized with a graphDB and a sqlDB via txtai). Can create compelling policy debate cases https://github.com/Hellisotherpeople/DebateKG
4. My failed attempt at a good extractive summarizer. My life work is dedicated to one day solving the problems I tried to fix with this project: https://github.com/Hellisotherpeople/CX_DB8
-
BooookScore
A package to generate summaries of long-form text and evaluate the coherence of these summaries. Official package for our ICLR 2024 paper, "BooookScore: A systematic exploration of book-length summarization in the era of LLMs".
Project mention: Evaluating faithfulness and content selection of LLMs in book-length summaries | news.ycombinator.com | 2024-04-09
With a link to https://arxiv.org/pdf/2310.00785.pdf - which then links to another GitHub repository, https://github.com/lilakk/BooookScore which has a bunch of prompts in https://github.com/lilakk/BooookScore/tree/main/prompts
Which makes me think that this original paper isn't evaluating LLMs so much as it's evaluating that one particular prompting technique for long summaries.
Gemini Pro 1.5 has 1m token context length, which should remove the need for weird hierarchical summary tricks. I wonder how well it would score?
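The "hierarchical summary tricks" mentioned in the comment above usually mean splitting a long text into chunks, summarizing each chunk, and then merging the partial summaries level by level until the result fits in the model's context. Below is a minimal sketch of that idea. It is not BooookScore's actual implementation: the `summarize` callable is a hypothetical stand-in for an LLM call, and the budget is counted in whitespace-separated words rather than real tokens.

```python
from typing import Callable, List


def chunk_words(text: str, max_words: int) -> List[str]:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def hierarchical_summary(
    text: str,
    summarize: Callable[[str], str],  # hypothetical: stands in for an LLM call
    max_words: int = 1000,            # assumed context budget, in words (not tokens)
) -> str:
    """Summarize chunks, then merge partial summaries until they fit the budget."""
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    current = text
    # Each pass summarizes every chunk and joins the results into a shorter text.
    while len(current.split()) > max_words:
        partials = [summarize(chunk) for chunk in chunk_words(current, max_words)]
        merged = " ".join(partials)
        # Guard against a summarizer that never shrinks its input (infinite loop).
        if len(merged.split()) >= len(current.split()):
            raise ValueError("summarize() must shorten its input")
        current = merged
    # The remaining text fits in one call, so produce the final summary.
    return summarize(current)
```

With a very large context window, the loop body is never entered and the function reduces to a single `summarize` call, which is the scenario the comment speculates about.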
-
easy-web-summarizer
Summarize webpages from specified URLs using the LangChain framework and the ChatOllama model
Project mention: Show HN: Easy Webpage Summarizer – Quickly Summarize Webpages and YouTube Videos | news.ycombinator.com | 2024-05-01
-
Auto-Research
Generate a custom, detailed survey paper with topic-clustered sections and proper citations from a single query in just under 30 minutes.
-
SelSum
Abstractive opinion summarization system (SelSum) and the largest dataset of Amazon product summaries (AmaSum). EMNLP 2021 conference paper.
-
Text-Summarization-using-NLP
Text summarization using NLP: fetches a BBC News article and summarizes its text, and also supports summarizing custom articles.
-
Teams-Transcript-Summarizer
Summarizes .vtt transcripts from Teams meetings using LangChain and GPT-4.
Python Summarization discussion
Python Summarization related posts
-
How critical theory is radicalizing high school debate
-
Copy is all you need
-
Transcribe YouTube Videos in Bulk
-
Access machine learning models from different cloud providers
-
The only Python SDK for accessing machine learning models from multiple providers.
-
Converse with book – Built with GPT-3
-
Targeted Summarization - A tool for information extraction
-
Index
What are some of the best open-source Summarization projects in Python? This list will help you:
# | Project | Stars |
---|---|---|
1 | haystack | 17,908 |
2 | sumy | 3,531 |
3 | RL4LMs | 2,221 |
4 | pytextrank | 2,152 |
5 | LLM-Finetuning-Toolkit | 785 |
6 | dr-doc-search | 601 |
7 | ollama-ebook-summary | 418 |
8 | simpleT5 | 388 |
9 | summarizepaper | 254 |
10 | CX_DB8 | 226 |
11 | textsum | 128 |
12 | factsumm | 109 |
13 | BooookScore | 109 |
14 | easy-web-summarizer | 103 |
15 | ctc-gen-eval | 93 |
16 | targetedSummarization | 87 |
17 | summarizers | 78 |
18 | Auto-Research | 55 |
19 | SelSum | 45 |
20 | Text-Summarization-using-NLP | 40 |
21 | bert2bert-summarization | 31 |
22 | Teams-Transcript-Summarizer | 26 |
23 | tldwol | 23 |