Python Summarization

Open-source Python projects categorized as Summarization

Top 23 Python Summarization Projects

  • haystack

    AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.

    Project mention: AI Engineer's Tool Review: Haystack | dev.to | 2024-12-03

    Are you curious about the NLP/GenAI/RAG framework for developers? Check out my opinionated developer review of Haystack, which emerges as a robust NLP/RAG framework that excels in search and retrieval applications.

  • sumy

    Module for automatic summarization of text documents and HTML pages.

  • RL4LMs

    A modular RL library to fine-tune language models to human preferences

  • pytextrank

    Python implementation of TextRank algorithms ("textgraphs") for phrase extraction

  • LLM-Finetuning-Toolkit

    Toolkit for fine-tuning, ablating and unit-testing open-source LLMs.

    Project mention: Show HN: Toolkit for LLM Fine-Tuning, Ablating and Testing | news.ycombinator.com | 2024-04-07

  • ollama-ebook-summary

    LLM for Long Text Summary (Comprehensive Bulleted Notes)

    Project mention: Ollama eBook Summary: A Different Way to Chat with PDF | dev.to | 2024-08-18

    You can explore and contribute to this project on GitHub: ollama-ebook-summary.

  • simpleT5

    simpleT5 is built on top of PyTorch-lightning⚡️ and Transformers🤗 and lets you quickly train your T5 models.

  • summarizepaper

    An AI-powered arXiv paper summarization website with a virtual assistant for answering questions.

  • CX_DB8

    A contextual, biasable, word-or-sentence-or-paragraph extractive summarizer powered by the latest in text embeddings (BERT, Universal Sentence Encoder, Flair)

    Project mention: Ask HN: What have you built with LLMs? | news.ycombinator.com | 2024-02-05

    I was working on this stuff before it was cool, so in the sense of the precursor to LLMs (and sometimes supporting LLMs still) I've built many things:

    1. Games you can play with word2vec or related models (could be drop in replaced with sentence transformer). It's crazy that this is 5 years old now: https://github.com/Hellisotherpeople/Language-games

    2. "Constrained Text Generation Studio" - A research project I wrote when I was trying to solve LLMs' inability to follow syntactic, phonetic, or semantic constraints: https://github.com/Hellisotherpeople/Constrained-Text-Genera...

    3. DebateKG - A bunch of "Semantic Knowledge Graphs" built on my pet debate evidence dataset (LLM-backed embedding indexes synchronized with a graphDB and a sqlDB via txtai). Can create compelling policy debate cases: https://github.com/Hellisotherpeople/DebateKG

    4. My failed attempt at a good extractive summarizer. My life work is dedicated to one day solving the problems I tried to fix with this project: https://github.com/Hellisotherpeople/CX_DB8

  • textsum

    CLI & Python API to easily summarize text-based files with transformers

  • factsumm

    FactSumm: Factual Consistency Scorer for Abstractive Summarization

  • BooookScore

    A package to generate summaries of long-form text and evaluate the coherence of these summaries. Official package for our ICLR 2024 paper, "BooookScore: A systematic exploration of book-length summarization in the era of LLMs".

    Project mention: Evaluating faithfulness and content selection of LLMs in book-length summaries | news.ycombinator.com | 2024-04-09

    With a link to https://arxiv.org/pdf/2310.00785.pdf - which then links to another GitHub repository, https://github.com/lilakk/BooookScore which has a bunch of prompts in https://github.com/lilakk/BooookScore/tree/main/prompts

    Which makes me think that this original paper isn't evaluating LLMs so much as it's evaluating that one particular prompting technique for long summaries.

    Gemini Pro 1.5 has a 1M-token context length, which should remove the need for weird hierarchical summary tricks. I wonder how well it would score?

  • easy-web-summarizer

    Summarize webpages from specified URLs using the LangChain framework and the ChatOllama model

    Project mention: Show HN: Easy Webpage Summarizer – Quickly Summarize Webpages and YouTube Videos | news.ycombinator.com | 2024-05-01

  • ctc-gen-eval

    EMNLP 2021 - CTC: A Unified Framework for Evaluating Natural Language Generation

  • targetedSummarization

    TextReducer - A Tool for Summarization and Information Extraction

  • summarizers

    Package for controllable summarization

  • Auto-Research

    Generate a custom, detailed survey paper with topic-clustered sections and proper citations from a single query in just under 30 minutes.

  • SelSum

    Abstractive opinion summarization system (SelSum) and the largest dataset of Amazon product summaries (AmaSum). EMNLP 2021 conference paper.

  • Text-Summarization-using-NLP

    Text summarization using NLP: fetches a BBC News article and summarizes its text; also supports summarization of custom articles.

  • bert2bert-summarization

    Abstractive summarization using Bert2Bert framework.

  • Teams-Transcript-Summarizer

    Summarizes .vtt transcripts from Teams meetings using LangChain and GPT-4.

  • tldwol

    Web API that summarizes multimedia from various sources using modern AI tools.

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).
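
Some projects on this list, such as CX_DB8, are described as extractive summarizers: they pick existing sentences out of a text instead of generating new ones. As a rough illustration of that general idea only, here is a minimal word-frequency sketch using just the Python standard library. It is not the algorithm of any project listed here; the function names, the tiny stop-word list, and the scoring rule are all assumptions made for this example.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real tools ship larger ones).
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "to", "and", "in", "it", "that", "on", "for"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def content_words(text):
    """Lowercased word tokens with stop words removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]


def summarize(text, max_sentences=2):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    freq = Counter(content_words(text))

    def score(sentence):
        tokens = content_words(sentence)
        # Average word frequency, so long sentences are not favoured automatically.
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    # Rank sentence indices by score; ties keep document order (sorted is stable).
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return [sentences[i] for i in keep]


if __name__ == "__main__":
    sample = "Cats sleep a lot. Cats eat fish. Dogs bark."
    print(summarize(sample, max_sentences=2))
```

Frequency scoring like this is only a baseline; the libraries above rely on graph-based ranking, embeddings, or fine-tuned models, so consult each project's own documentation for its actual API.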

Index

What are some of the best open-source Summarization projects in Python? This list will help you:

Rank  Project                        Stars
1     haystack                       17,908
2     sumy                           3,531
3     RL4LMs                         2,221
4     pytextrank                     2,152
5     LLM-Finetuning-Toolkit         785
6     dr-doc-search                  601
7     ollama-ebook-summary           418
8     simpleT5                       388
9     summarizepaper                 254
10    CX_DB8                         226
11    textsum                        128
12    factsumm                       109
13    BooookScore                    109
14    easy-web-summarizer            103
15    ctc-gen-eval                   93
16    targetedSummarization          87
17    summarizers                    78
18    Auto-Research                  55
19    SelSum                         45
20    Text-Summarization-using-NLP   40
21    bert2bert-summarization        31
22    Teams-Transcript-Summarizer    26
23    tldwol                         23
