Readability

Open-source projects categorized as Readability

Top 23 Readability Open-Source Projects

  • GitHub repo percollate

    A command-line tool to turn web pages into beautiful, readable PDF, EPUB, or HTML docs.

    Project mention: Alternatives to ArchiveBox? | reddit.com/r/selfhosted | 2021-10-30

    Maybe https://github.com/danburzo/percollate. I didn't try it, and I'm not sure if the HTML output looks the way you want.

  • GitHub repo web-clipper

    For Notion, OneNote, Bear, Yuque, Joplin. Clip anything to anywhere

    Project mention: Has anyone tried using a non-OneNote webclipper/3rd Party Webclipper with OneNote? | reddit.com/r/OneNote | 2021-08-06

    For a third-party clipper, I installed this one before: https://github.com/webclipper/web-clipper. It looks promising, but it could not log into my OneNote account, so I gave up on it.

  • GitHub repo Just-Read

    A customizable read mode web extension.

    Project mention: The most underused browser feature | news.ycombinator.com | 2021-08-25

    It uses a pretty simple text selection algorithm I've developed through trial and error: https://github.com/ZachSaucier/Just-Read/blob/6dcb4f05b93287...

    I don't know how it compares to Readability.js.

  • GitHub repo stylebot

    Change the appearance of the web instantly

    Project mention: Stack Overflow AMOLED theme using External CSS 💯 | reddit.com/r/web_design | 2021-11-15

    With Stylebot for example

  • GitHub repo textstat

    Python package to calculate readability statistics of a text object - paragraphs, sentences, articles.

    Project mention: Question on easing comprehension | dev.to | 2021-09-15

    Python https://github.com/shivam5992/textstat

  • GitHub repo Midnight-Lizard

    Custom color schemes for all websites

    Project mention: Do you know any lesser-known websites/apps that are useful? | reddit.com/r/Polska | 2021-10-07

  • GitHub repo go-readability

    Go package that cleans a HTML page for better readability.

    Project mention: Show HN: Forlater.email – an email-based bookmarking service | news.ycombinator.com | 2021-09-19

    I'm using https://github.com/go-shiori/go-readability -- a Go re-implementation of Mozilla's readability-js library. It does a pretty good job.

  • GitHub repo code-review-checklist

    This code review checklist helps you be a more effective and efficient code reviewer.

    Project mention: Developer checklist | reddit.com/r/ExperiencedDevs | 2021-07-05

    There is this checklist.

  • GitHub repo trafilatura

    Web scraping library and command-line tool for text discovery and extraction (main content, metadata, comments)

    Project mention: What's something self hosted everyone needs to run ? | reddit.com/r/selfhosted | 2021-09-02

    The only RSS reader (not an aggregator like FreshRSS) with a built-in parser that I know of is RSSowlNix (which still works, but it's a bit old and not very lightweight). I wrote myself a quick-and-dirty workaround in Python to solve this problem (since I'm a noob in PHP). It reads the SQLite database of FreshRSS, adds a new column to mark articles that have already been parsed, and then uses trafilatura to attempt full-text parsing for every newly added article. To be on the safe side, I run it through a proxy. Not all pages work (some JavaScript GDPR popups are a nightmare), and you cannot get past paywalls, but it works most of the time.
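
The workaround described in this comment could look roughly like the sketch below. It is not the commenter's script: the table name `entry` and the columns `id`, `link`, `content`, and the added `parsed` marker are assumptions about the database layout, the helper names are hypothetical, and proxy handling is left out. Only `trafilatura.fetch_url` and `trafilatura.extract` are calls from the library itself.

```python
import sqlite3
from typing import Callable, Optional


def ensure_parsed_column(conn: sqlite3.Connection) -> None:
    """Add a 'parsed' marker column to the (assumed) entry table if missing."""
    columns = [row[1] for row in conn.execute("PRAGMA table_info(entry)")]
    if "parsed" not in columns:
        conn.execute(
            "ALTER TABLE entry ADD COLUMN parsed INTEGER NOT NULL DEFAULT 0"
        )


def trafilatura_extractor(url: str) -> Optional[str]:
    """Download a page and return its main text, or None on failure."""
    import trafilatura  # third-party; imported lazily so the sketch loads without it

    downloaded = trafilatura.fetch_url(url)
    if downloaded is None:
        return None
    return trafilatura.extract(downloaded)


def fill_fulltext(
    conn: sqlite3.Connection,
    extract_fulltext: Callable[[str], Optional[str]],
) -> int:
    """Replace stub content with extracted full text for unparsed entries.

    Returns the number of entries updated. Entries whose extraction fails
    keep their original content and stay unmarked, so a later run retries them.
    """
    ensure_parsed_column(conn)
    rows = conn.execute("SELECT id, link FROM entry WHERE parsed = 0").fetchall()
    updated = 0
    for entry_id, link in rows:
        text = extract_fulltext(link)
        if text:
            conn.execute(
                "UPDATE entry SET content = ?, parsed = 1 WHERE id = ?",
                (text, entry_id),
            )
            updated += 1
    conn.commit()
    return updated
```

Under these assumptions, a run against a copy of the database would be `fill_fulltext(sqlite3.connect("freshrss-copy.sqlite"), trafilatura_extractor)`, where the file name is a placeholder. Passing the extractor as a parameter keeps the database logic testable without network access.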

  • GitHub repo scrape

    Scrape any website, article or RSS/Atom Feed with ease!

  • GitHub repo CSharpForMarkup

    Use declarative style C# instead of XAML for Xamarin Forms UI

    Project mention: Performance Improvements in .NET 6 | news.ycombinator.com | 2021-08-19

    To manage mental mapping of what is being drawn, I keep methods that create widget trees short. My rule of thumb is that whole method has to comfortably fit the screen at once. For each section of the main tree I create a static function that returns a branch of the tree. These functions have descriptive names that help you visualize what element each function builds. If a tree inside a function is long, it is broken down in the same way.

    There are some fluent extensions (for Xamarin.Forms and probably future MAUI) that help you build UI in declarative fashion with C#. Same extensions could be created for other frameworks.

    https://devblogs.microsoft.com/xamarin/c-sharp-markup-for-xa...

    https://github.com/VincentH-Net/CSharpForMarkup

    For reusable custom widgets that can't be done with a static function, I create new classes with their own widget trees. Try to keep widgets composable and avoid inheritance if possible.

    Hot reload is coming in .NET 6, so waiting for rebuild will soon be history.

    I have no experience with QML so I can't really comment on that.

  • GitHub repo article-parser

    To extract main article from given URL with Node.js

    Project mention: How to get the main topic of a Web article? | reddit.com/r/node | 2021-02-14

  • GitHub repo readability

    Readability is an Elixir library for extracting and curating articles. (by keepcosmos)

  • GitHub repo Cadmium

    Natural Language Processing (NLP) library for Crystal

  • GitHub repo mercury_fulltext

    📖 Enjoy full text for tt-rss.

    Project mention: Journalist: A RSS aggregator that speaks the Fever API | reddit.com/r/selfhosted | 2021-01-10

    I've kept Feediron as a manual method mainly because it's very hard to determine if the stub is useful content or not. A more automatic method is the mercury_fulltext plugin that uses the mercury-parser-api to automatically fetch the full article text

  • GitHub repo paperoni

    An article extractor in Rust

    Project mention: Paperoni 0.6.0 release | reddit.com/r/rust | 2021-07-24

    Hello r/rust, I've released v0.6.0-alpha1 of Paperoni today. Paperoni is an article downloader that can download web articles into EPUB files. This current release also allows you to export articles as HTML files which opens up the possibilities for exporting to PDF. This was a feature requested about 3 months ago when I first posted about this project. Feel free to check it out and give any feedback. Thanks!

  • GitHub repo sspipe

    Simple Smart Pipe: Python productivity tool for rapid data manipulation

    Project mention: The "Connector" in main function? | reddit.com/r/learnpython | 2021-08-19

  • GitHub repo SAPC-APCA

    SAPC (S-LUV Advanced Predictive Color) is an accessibility appearance model which spawned APCA (Advanced Perceptual Contrast Algorithm) for use in emerging web standards in determining readability contrast.

    Project mention: I made a tool to check the contrast between two colours using the new contrast algorithm being drafted for WCAG 3 — also includes fun rgb/hsl/lch sliders and buttons to adjust either colour to a desired contrast! | reddit.com/r/web_design | 2021-10-27

    Andrew Somers, W3 AGWG Invited Expert; Color Science Researcher, Myndex Perception Research; Inventor of APCA / co-author of WCAG 3

  • GitHub repo retext-readability

    plugin to check readability

    Project mention: Question on easing comprehension | dev.to | 2021-09-15

  • GitHub repo Readability4J

    A Kotlin port of Mozilla's Readability. It extracts a website's relevant content and removes all clutter from it.

    Project mention: Show HN: Instantly Listen to Any URL | news.ycombinator.com | 2021-08-13

    Not sure about OP, but I just implemented this in my Hacker News Android client (thanks for the idea, OP).

    This is how I implemented it. I had already achieved article-to-"reader mode" conversion by heavily customizing the Kotlin port of Mozilla's Readability:

    https://github.com/dankito/Readability4J

    Then I pass the text via Android's TextToSpeech library and it works very well:

        fun trySpeaking(str:String){

  • GitHub repo Neural-Scam-Artist

    Web Scraping, Document Deduplication & GPT-2 Fine-tuning with a newly created scam dataset.

    Project mention: [self-promotion] A dataset of email scams (Github Link) | reddit.com/r/datasets | 2021-10-30

  • GitHub repo validate-access

    Parse and validate a given directory with multiple entries

    Project mention: Utility to Parse and Validate Directory with Multiple Entries | reddit.com/r/node | 2021-03-02

  • GitHub repo attest

    A small library to make Go tests more readable.

    Project mention: Is there a process for preparing a library for Go 2 compatibility? | reddit.com/r/golang | 2021-04-09

    I have a library which would greatly benefit from the generics feature of Go 2. Is there a way that I can develop and test against the current development version of Go 2, or is it too early?

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2021-11-15.

Index

What are some of the best open-source Readability projects? This list will help you:

Rank  Project                Stars
1     percollate             3,529
2     web-clipper            3,461
3     Just-Read              942
4     stylebot               909
5     textstat               730
6     Midnight-Lizard        474
7     go-readability         345
8     code-review-checklist  343
9     trafilatura            316
10    scrape                 314
11    CSharpForMarkup        311
12    article-parser         250
13    readability            191
14    Cadmium                181
15    mercury_fulltext       135
16    paperoni               113
17    sspipe                 110
18    SAPC-APCA              95
19    retext-readability     68
20    Readability4J          62
21    Neural-Scam-Artist     15
22    validate-access        2
23    attest                 0