article-extraction-benchmark vs dom-distiller
Metric | article-extraction-benchmark | dom-distiller
---|---|---
Mentions | 1 | 3
Stars | 242 | 594
Growth | 5.8% | -
Activity | 0.0 | 0.0
Last commit | almost 3 years ago | over 2 years ago
Language | Python | Java
License | MIT License | GNU General Public License v3.0 or later
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
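As a rough illustration of the two derived metrics described above, the sketch below computes month-over-month star growth and maps a raw activity value onto a 0-10 scale by percentile rank. The function names, the percentile mapping, and the sample numbers are assumptions made for illustration; the site's actual formula, including how it weights recent commits, is not given here.

```python
def month_over_month_growth(previous_stars: int, current_stars: int) -> float:
    """Return star growth as a percentage of the previous month's count."""
    if previous_stars <= 0:
        raise ValueError("previous_stars must be positive")
    return (current_stars - previous_stars) * 100.0 / previous_stars


def activity_score(project_value: float, all_values: list[float]) -> float:
    """Map a raw activity value to a 0-10 scale by percentile rank.

    Assumed scheme: the score is ten times the fraction of tracked projects
    whose raw value is strictly lower. The raw value itself is assumed to be
    computed elsewhere from recency-weighted commits.
    """
    if not all_values:
        raise ValueError("all_values must not be empty")
    below = sum(1 for value in all_values if value < project_value)
    return round(10.0 * below / len(all_values), 1)


# Hypothetical numbers, not taken from the table above.
growth = month_over_month_growth(previous_stars=200, current_stars=210)
score = activity_score(90.0, [float(v) for v in range(100)])
```

Under this assumed mapping, a project that outranks 90 of 100 tracked projects scores 9.0, which matches the "top 10%" reading of the example above.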
Posts mentioning article-extraction-benchmark or dom-distiller:
- How does Firefox's Reader View work?
- The most underused browser feature
- An app like Pocket to read articles and highlight?
  The one ask you have that Literal doesn't yet support is read mode for sources (though it will automatically archive / backup sources). It looks like Chrome's read mode (i.e. the "Show simplified view" toolbar) is open source, so I think I could add support relatively quickly if you're interested.
What are some alternatives?
unclutter - A modern reader mode and article library for your browser.
readability - Readability is a library written in Go (golang) to parse, analyze and convert HTML pages into readable content. Originally an Arc90 Experiment, it is now incorporated into Safari’s Reader View.
go-domdistiller - Go-DomDistiller is a Go port of the DOM Distiller library which implements Reader mode in Chrome for Android and Desktop. It has no dependencies on Chromium and is meant to run as a command line program or on a server.
ftr-site-config - Site-specific article extraction rules to aid content extractors, feed readers, and 'read later' applications.
go-dateparser - Go parser for human-readable dates, ported from the dateparser Python package
parser - 📜 Extract meaningful content from the chaos of a web page
go-trafilatura - go-trafilatura is a Go port of the trafilatura Python library.
arc90-readability - A copy of the original Arc90 repo with links to many of the current ports.
soup-strainer - A reimplementation of the Readability/Decruft algorithm using BeautifulSoup and html5lib
go-htmldate - CLI and Go package for extracting the publication date of web pages.
einkbro - A small, fast web browser based on Android WebView. It's tailored for E-Ink devices but also works great on normal Android devices.