Readability4J

A Kotlin port of Mozilla's Readability. It extracts a website's relevant content and removes all clutter from it. (by dankito)

Readability4J Alternatives

Similar projects and alternatives to Readability4J

  1. Typesense

    Open Source alternative to Algolia + Pinecone and an Easier-to-Use alternative to ElasticSearch ⚡ 🔍 ✨ Fast, typo tolerant, in-memory fuzzy Search Engine for building delightful search experiences

  2. CodeRabbit

    CodeRabbit: AI Code Reviews for Developers. Revolutionize your code reviews with AI. CodeRabbit offers PR summaries, code walkthroughs, 1-click suggestions, and AST-based analysis. Boost productivity and code quality across all major languages with each PR.

  3. readability

    A standalone version of the readability lib

  4. unclutter

    39 Readability4J VS unclutter

    A modern reader mode and article library for your browser.

  5. zombodb

    Making Postgres and Elasticsearch work together like it's 2023

  6. trafilatura

    Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML

  7. parser

    12 Readability4J VS parser

    📜 Extract meaningful content from the chaos of a web page

  8. article-extractor

    To extract main article from given URL with Node.js

  9. SaaSHub

    SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives

  10. web-clipper

    For Notion, OneNote, Bear, Yuque, Joplin. Clip anything to anywhere

  11. go-readability

    Go package that cleans an HTML page for better readability.

  12. Crate

    CrateDB is a distributed and scalable SQL database for storing and analyzing massive amounts of data in near real-time, even with complex queries. It is PostgreSQL-compatible, and based on Lucene.

  13. percollate

    15 Readability4J VS percollate

    A command-line tool to turn web pages into readable PDF, EPUB, HTML, or Markdown docs.

  14. dragnet

    Just the facts -- web page content extraction

  15. go-domdistiller

    Go-DomDistiller is a Go port of the DOM Distiller library which implements Reader mode in Chrome for Android and Desktop. It has no dependencies on Chromium and is meant to run as a command line program or on a server.

  16. Just-Read

    5 Readability4J VS Just-Read

    A customizable read mode web extension.

  17. arc90-readability

    A copy of the original Arc90 repo with links to many of the current ports.

  18. dom-distiller

    Discontinued. Distills the DOM

  19. knowledge

    A knowledge daemon to collect ideas and auto organize them, with SQLite (by daitangio)

  20. go-dateparser

    Go parser for human-readable dates, ported from the Python dateparser package

  21. article-extraction-benchmark

    Article extraction benchmark: dataset and evaluation scripts

NOTE: The number shown next to a project on this list counts its mentions in common posts plus user-suggested alternatives. Hence, a higher number means a better Readability4J alternative or higher similarity.

Readability4J reviews and mentions

Posts with mentions or reviews of Readability4J. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-07-12.
  • Creating an advanced search engine with PostgreSQL
    9 projects | news.ycombinator.com | 12 Jul 2023
    Depending on the type of content, one might want to look into using Readability (the browser's reader view) to parse the webpage. It will give you all the useful info without the junk. Then you can put it in the DB as needed.

    https://github.com/mozilla/readability

    Btw, Readability is also available in a few other languages, like Kotlin:

    https://github.com/dankito/Readability4J

  • How does Firefox's Reader View work?
    15 projects | news.ycombinator.com | 30 Mar 2022
    My Hacker News client, HACK, for iOS and Android has a browser with a reader mode. On iOS I was able to use the reader mode feature provided by SFSafariViewController, but that wasn't available on Android.

    So I had to read a ton about this. I ended up using a heavily modified Kotlin version of Readability:

    https://github.com/dankito/Readability4J

    https://play.google.com/store/apps/details?id=com.pranapps.h...

    https://apps.apple.com/us/app/id1464477788

  • Show HN: Instantly Listen to Any URL
    3 projects | news.ycombinator.com | 13 Aug 2021
    Not sure about OP, but I just implemented this in my Hacker News Android client (thanks for the idea, OP).

    This is how I implemented it. I had already built an article "reader mode" by heavily customizing the Kotlin port of Mozilla's Readability:

    https://github.com/dankito/Readability4J

    Then I pass the text to Android's TextToSpeech library, and it works very well:

        fun trySpeaking(str:String){
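
The posts above describe a two-step pipeline: extract the article text with Readability4J, then hand it to Android's TextToSpeech. The poster's own `trySpeaking` snippet is cut off on this page, so the following is not their code. It is only a minimal sketch of one step such a pipeline might need: splitting long article text into chunks before speaking it. It assumes the text comes from something like `Readability4J(url, html).parse().textContent`, as shown in the Readability4J README. The helper name `splitForSpeech` and the default limit of 4000 characters are assumptions made for illustration; the limit is meant to mirror what `TextToSpeech.getMaxSpeechInputLength()` reports on Android, which should be checked at runtime rather than hard-coded.

```kotlin
// Hypothetical helper (not part of Readability4J or Android):
// splits extracted article text into chunks of at most maxLen characters,
// preferring to break after sentence-ending punctuation, then at a space.
fun splitForSpeech(text: String, maxLen: Int = 4000): List<String> {
    require(maxLen > 0) { "maxLen must be positive" }
    val chunks = mutableListOf<String>()
    var rest = text.trim()
    while (rest.length > maxLen) {
        val window = rest.substring(0, maxLen)
        // Look for the last sentence end inside the window, then the last space.
        val sentenceEnd = window.lastIndexOfAny(charArrayOf('.', '!', '?'))
        val space = window.lastIndexOf(' ')
        val cut = when {
            sentenceEnd > 0 -> sentenceEnd + 1 // keep the punctuation in this chunk
            space > 0 -> space                 // break between words
            else -> maxLen                     // no natural break: cut hard
        }
        chunks += rest.substring(0, cut).trim()
        rest = rest.substring(cut).trim()
    }
    if (rest.isNotEmpty()) chunks += rest
    return chunks
}

// On Android, each chunk could then be queued with something like
// tts.speak(chunk, TextToSpeech.QUEUE_ADD, null, utteranceId),
// where tts is an already initialized TextToSpeech instance (assumed here).
```

Breaking at sentence boundaries keeps pauses between chunks in natural places; the hard cut is only a fallback for text without punctuation or spaces.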

Stats

Basic Readability4J repo stats
3
149
4.3
over 3 years ago
