Readability4J
readable
Readability4J | readable | |
---|---|---|
3 | 3 | Mentions |
135 | 78 | Stars |
- | - | Growth |
4.3 | 4.7 | Activity |
over 2 years ago | about 2 months ago | Last commit |
HTML | HTML | Language |
Apache License 2.0 | GNU General Public License v3.0 only | License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
Readability4J
-
Creating an advanced search engine with PostgreSQL
Depending on the type of content, one might want to look into using Readability (the browser's reader view) to parse the webpage. It will give you all the useful info without the junk. Then you can put it in the DB as needed.
https://github.com/mozilla/readability
Btw, Readability is also available in a few other languages, like Kotlin:
https://github.com/dankito/Readability4J
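As a rough illustration of the flow described above, here is a minimal sketch that runs Readability4J over already-fetched HTML and collects the fields one might store. It assumes the Readability4J dependency is on the classpath; `ExtractedPage` and `extractPage` are hypothetical names introduced for this example, and fetching the page and writing to the database are left out.

```kotlin
import net.dankito.readability4j.Readability4J

// Hypothetical holder for the fields one might store in a database row.
data class ExtractedPage(val url: String, val title: String?, val text: String?)

// Runs Readability4J over already-fetched HTML and keeps title and plain text.
fun extractPage(url: String, html: String): ExtractedPage {
    // The URL is only used by the library to resolve relative links.
    val article = Readability4J(url, html).parse()
    // title and textContent are nullable: extraction can come back empty.
    return ExtractedPage(url, article.title, article.textContent)
}

fun main() {
    val html = """
        <html><head><title>Sample post</title></head>
        <body><nav>Home | About</nav>
        <article><p>The main text of the post goes here.</p></article>
        </body></html>
    """.trimIndent()
    val page = extractPage("https://example.com/post", html)
    println(page.title)
    println(page.text)
}
```

Because both fields can be null, a caller would probably want to skip or flag pages where extraction found nothing before indexing them.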
-
How does Firefox's Reader View work?
My Hacker News client HACK for iOS and Android has a browser with a reader mode. On iOS, I was able to use the reader mode feature provided by SFSafariViewController, but that wasn't available on Android.
So I had to read a ton about this. I ended up using a heavily modified Kotlin version of Readability:
https://github.com/dankito/Readability4J
https://play.google.com/store/apps/details?id=com.pranapps.h...
https://apps.apple.com/us/app/id1464477788
-
Show HN: Instantly Listen to Any URL
Not sure about OP, but I just implemented this in my Hacker News Android client (thanks for the idea, OP).
This is how I implemented it. I had already built the article-to-"reader mode" conversion by heavily customizing the Kotlin port of Mozilla's Readability:
https://github.com/dankito/Readability4J
Then I pass the text to Android's TextToSpeech API and it works very well:
fun trySpeaking(str:String){
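The snippet above is cut off in the source, so the commenter's actual implementation is not known. One practical detail when handing extracted article text to Android's TextToSpeech is that the engine limits the length of a single input (see `TextToSpeech.getMaxSpeechInputLength()`). The sketch below is an assumption about how long text could be split beforehand, not the commenter's code. `chunkForSpeech` is a hypothetical helper, and the default of 4000 characters is illustrative.

```kotlin
// Hypothetical helper, not from the original comment: splits long text into
// chunks of at most maxLen characters, preferring to break at whitespace.
fun chunkForSpeech(text: String, maxLen: Int = 4000): List<String> {
    require(maxLen > 0) { "maxLen must be positive" }
    val chunks = mutableListOf<String>()
    var rest = text.trim()
    while (rest.length > maxLen) {
        // Find the last whitespace at or before index maxLen.
        val window = rest.substring(0, maxLen + 1)
        val cut = window.indexOfLast { it.isWhitespace() }
        // No whitespace in the window: fall back to a hard split.
        val end = if (cut > 0) cut else maxLen
        chunks.add(rest.substring(0, end).trim())
        rest = rest.substring(end).trim()
    }
    if (rest.isNotEmpty()) chunks.add(rest)
    return chunks
}

fun main() {
    println(chunkForSpeech("aaa bbb ccc", 7)) // [aaa bbb, ccc]
    println(chunkForSpeech("abcdefghij", 4))  // [abcd, efgh, ij]
}
```

Each chunk could then be queued with `TextToSpeech.speak(chunk, TextToSpeech.QUEUE_ADD, null, utteranceId)` once the engine's `OnInitListener` has reported `TextToSpeech.SUCCESS`.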
readable
-
Readable: A service for reading long-form content on any device
Original author here. Thanks for the submission. I built the service mostly for myself because I have an old Kobo eBook reader and the browser is pretty basic. This works very well, though.
Only after building it did I realize that there could be more use cases, for example when traveling and working over a bad connection. The pages are usually much more lightweight than the original ones and don't contain any trackers.
I don't even have metrics for that service. No idea how many people are using it and I don't care for the most part. That said, if anyone is interested in using the service you can run your own instance and be completely independent.
The code is here BTW: https://github.com/readable-app/readable.
-
A Reader Mode Proxy for the Slow Web, Deployed on shuttle.rs
Did you mean the meta headers? Any example in particular that I could look at? We're in the process of testing different readability libraries. Here's a discussion on what we saw so far: https://github.com/readable-app/readable/issues/2 Maybe you can add a comment there so we don't forget to check your use-case.
What are some alternatives?
go-readability - Go package that cleans a HTML page for better readability.
article-extractor - To extract main article from given URL with Node.js
asciidoctor-html5s - Semantic HTML5 converter (backend) for Asciidoctor
Just-Read - A customizable read mode web extension.
mmark - Strict markdown processor for writers
percollate - A command-line tool to turn web pages into readable PDF, EPUB, HTML, or Markdown docs.
itext-pdfhtml-dotnet - pdfHTML is an iText add-on for C# (.NET) that allows you to easily convert HTML and CSS into standards compliant PDFs that are accessible, searchable and usable for indexing.
web-clipper - For Notion, OneNote, Bear, Yuque, Joplin. Clip anything to anywhere
unclutter - A modern reader mode and article library for your browser.
go-domdistiller - Go-DomDistiller is a Go port of the DOM Distiller library which implements Reader mode in Chrome for Android and Desktop. It has no dependencies on Chromium and is meant to run as a command line program or on a server.
knowledge - A knowledge daemon to collect ideas and auto organize them, with SQLite