readability

A standalone version of the readability lib (by mozilla)

Readability Alternatives

Similar projects and alternatives to readability

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a better readability alternative or higher similarity.

Reviews and mentions

Posts with mentions or reviews of readability. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2021-08-25.
  • The most underused browser feature
    news.ycombinator.com | 2021-08-25
    Any developers who'd like to contribute to improving how article content is extracted from web pages should check out Mozilla's Readability repository: https://github.com/mozilla/readability

    I'm currently trying to bring the PHP port up to speed here: https://github.com/fivefilters/readability.php

    We currently use an older version as part of our article extraction for Push to Kindle: https://www.fivefilters.org/push-to-kindle/

    news.ycombinator.com | 2021-08-25
    I gotta go back and figure out if I can submit a PR to fix footnotes, though [1]. I don't like using Reader Mode when I know there might be stuff missing, so I itch to go and check.

    [1]: https://github.com/mozilla/readability/issues/654

    news.ycombinator.com | 2021-08-25
    I've just poked through both the GetPocket site (https://getpocket.com/publisher/) and Mozilla's Readability Library GitHub page (https://github.com/mozilla/readability) without seeing obvious guidelines.

    My general suspicion is that adhering to a simple HTML5 document structure, and possibly using microformats (https://microformats.io/), goes a long way.

  • After 3 years mozilla still did not open source pocket
    news.ycombinator.com | 2021-08-20
    I think this[0] comes close to what is used to extract text from an HTML document. Fetching can be done via any HTTP client. Will need jsdom to convert the text to DOM before feeding it to readability.

    [0]: https://github.com/mozilla/readability
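
The fetch, parse, extract pipeline described in this comment can be sketched in Python using only the standard library. This is a rough illustration and not Readability's algorithm: `html.parser` stands in for jsdom, and the names `ParagraphCollector`, `extract_paragraphs`, and `fetch_html` are hypothetical helpers that simply gather the text of `<p>` elements.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class ParagraphCollector(HTMLParser):
    """Toy stand-in for an article extractor: keeps the text of <p> elements."""

    def __init__(self):
        super().__init__()
        self.in_paragraph = False
        self.current = []      # text chunks of the paragraph being read
        self.paragraphs = []   # finished, whitespace-normalized paragraphs

    def _flush(self):
        # Collapse runs of whitespace and store non-empty paragraphs.
        text = " ".join("".join(self.current).split())
        if text:
            self.paragraphs.append(text)
        self.current = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            if self.in_paragraph:
                self._flush()  # previous <p> was never closed explicitly
            self.in_paragraph = True

    def handle_endtag(self, tag):
        if tag == "p" and self.in_paragraph:
            self._flush()
            self.in_paragraph = False

    def handle_data(self, data):
        if self.in_paragraph:
            self.current.append(data)


def extract_paragraphs(html: str) -> list[str]:
    """Parse an HTML string and return the paragraph texts in document order."""
    parser = ParagraphCollector()
    parser.feed(html)
    parser.close()
    return parser.paragraphs


def fetch_html(url: str) -> str:
    """Fetch a page with the standard-library HTTP client (any client would do)."""
    with urlopen(url, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Calling `extract_paragraphs(fetch_html(url))` chains the two steps. A real extractor would also score and filter candidate nodes, which this sketch does not attempt.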

  • A Unix-style personal search engine and web crawler for your digital footprint
    news.ycombinator.com | 2021-07-26
    Looks very much like one of the ideas I've been thinking of building! The way I planned to do it was to use a similar approach to rga for files ( https://github.com/phiresky/ripgrep-all ) and have a webextension pull all the webpages I visit (filtered via something like https://github.com/mozilla/readability ), then dump that into either sqlite with FTS5 or postgres with FTS for search.

    A good search engine for "my stuff" and "stuff I've seen before" is not available for most people in my experience.

    ---

    Two things I'd mention are:

    1. Digital footprint usually means your info on other sites, not just things I've accessed. If I read a blog, that is not part of my footprint, but if I leave a comment on that blog, that comment is part of it. The term is also mostly used in a tracking and negative context (although there are exceptions), so you might want to change that: https://en.wikipedia.org/wiki/Digital_footprint

    2. I don't really get what makes it UNIX-style (or what exactly you mean by that; there seem to be many definitions), and the readme does not seem to clarify much besides expecting me to notice it by myself.
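
The "dump that into sqlite with FTS5" idea from this comment might look roughly like the following. It is a minimal sketch and assumes the local SQLite build includes the FTS5 extension (most Python distributions do, but it is not guaranteed). The table layout and the `build_index` and `search` helpers are illustrative assumptions, not taken from any project mentioned here.

```python
import sqlite3


def build_index(pages):
    """Create an in-memory FTS5 index from (url, title, body) tuples."""
    conn = sqlite3.connect(":memory:")
    # The url column is stored but not tokenized, so only title and body are searched.
    conn.execute("CREATE VIRTUAL TABLE pages USING fts5(url UNINDEXED, title, body)")
    conn.executemany("INSERT INTO pages (url, title, body) VALUES (?, ?, ?)", pages)
    conn.commit()
    return conn


def search(conn, query, limit=10):
    """Return (url, title) rows matching an FTS5 query, best match first."""
    rows = conn.execute(
        "SELECT url, title FROM pages WHERE pages MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()


if __name__ == "__main__":
    index = build_index([
        ("https://example.com/a", "Reader mode", "Firefox offers a reader view."),
        ("https://example.com/b", "Cooking", "A recipe for soup."),
    ])
    print(search(index, "reader"))
```

The body text would come from whatever extraction step runs before indexing. A persistent database file can replace `":memory:"` for real use.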

  • Article Parser - Parse News, Blog any kind of articles | RapidAPI
    - Feature Request (FR): I really like the output of Mozilla's Readability, because it also provides a normalized markup version of an article that you can just dump into the browser and it looks OK, which might be interesting for you to add.
  • A 4 minute introduction to RSS
    news.ycombinator.com | 2021-07-02
    If you're trying to build one yourself, have a look at the open source Readability code[1]. It was originally developed by Arc90 and is now used by Apple and Mozilla in their browser reader views. The code has been ported to a number of different languages.

    I work on a service called Full-Text RSS[2] that used a PHP port of Readability, coupled with site-specific extraction rules[3] to identify and extract article content from each feed item. It then produces a full-text version of the given feed. The idea is you subscribe to the full-text version in whichever feed reader you use and it will transparently give you full-text articles where you had partial content before.

    [1] https://github.com/mozilla/readability

    [2] https://www.fivefilters.org/full-text-rss/

    [3] https://github.com/fivefilters/ftr-site-config
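
The feed-expansion idea described in this comment (take each feed item, extract the full article, emit a full-text entry) can be sketched as below. This is only an illustration under stated assumptions: it handles plain RSS 2.0 without namespaces, `expand_feed` is a hypothetical name, and the extraction step is passed in as a callable rather than implemented, so nothing here reflects how Full-Text RSS itself works internally.

```python
import xml.etree.ElementTree as ET
from typing import Callable


def expand_feed(rss_xml: str, extract_full_text: Callable[[str], str]) -> list[dict]:
    """Return one entry per RSS <item>, with content supplied by the extractor.

    extract_full_text receives the item's link and returns the article text;
    in practice it would fetch the page and run an extractor over it.
    """
    root = ET.fromstring(rss_xml)
    entries = []
    for item in root.iter("item"):
        link = (item.findtext("link") or "").strip()
        if not link:
            continue  # nothing to fetch for items without a link
        entries.append({
            "title": (item.findtext("title") or "").strip(),
            "link": link,
            "content": extract_full_text(link),
        })
    return entries
```

Passing the extractor as a parameter keeps the feed handling separate from the extraction strategy, so site-specific rules or a Readability port could be swapped in without changing this function.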

  • Where do I find the full list of sites supported by Speedreader?
    It doesn't use a pre-defined list, it uses Mozilla's Readability library.
  • [Feature Request] Enable Firefox Reader View for text posts
    However there is a standalone library here: https://github.com/mozilla/readability
  • How to create books from websites?
    reddit.com/r/kindle | 2021-05-05
    Papeer uses RSS feeds to get the links to the articles. It then grabs the content directly on the HTML page with Mozilla Readability. The whole content should always be here.
  • paperoni: An article downloader written in Rust
    reddit.com/r/rust | 2021-04-29
    Hello r/rust, I made a new minor release of a project of mine called Paperoni. As pointed out in the title, it downloads web articles and saves them as EPUBs. The articles are extracted using the Mozilla Readability algorithm. I wrote a port of it to Rust, which is used within the project. Hopefully you enjoy using the tool. Feel free to open an issue if you find a bug or would like to make a suggestion. You can read about my future plans for the project here. Have a good weekend.
  • Zenreader: A 4.7 Inches E-Ink RSS Reader Powered by ESP32
    news.ycombinator.com | 2021-04-19
    https://github.com/mozilla/readability

    I had no idea that Readability.js was available as a standalone library. That’s awesome!

  • Contrast Rebellion – to hell with unreadable, low-contrast texts
    news.ycombinator.com | 2021-04-08
    You actually can't; it is purely heuristic, and (afaik intentionally) has no API facing page scripts or content. It just examines page content and, if it finds "enough" text, offers reader mode.

    Here's how Firefox's reader mode implementation does it: https://github.com/mozilla/readability/blob/master/Readabili...
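
The "enough text" check mentioned in this comment can be illustrated with a toy heuristic. This is a sketch under assumptions, not the logic Firefox or Readability actually uses: the 500-character threshold, the restriction to `<p>` elements, and the names `TextLengthCounter` and `has_enough_text` are all made up for the example.

```python
from html.parser import HTMLParser


class TextLengthCounter(HTMLParser):
    """Sums the length of text found inside <p> elements."""

    def __init__(self):
        super().__init__()
        self.in_paragraph = False
        self.total = 0

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.in_paragraph = True

    def handle_endtag(self, tag):
        if tag == "p":
            self.in_paragraph = False

    def handle_data(self, data):
        if self.in_paragraph:
            self.total += len(data.strip())


def has_enough_text(html: str, min_chars: int = 500) -> bool:
    """Guess whether a page is article-like from its paragraph text volume.

    min_chars is an arbitrary illustrative threshold.
    """
    counter = TextLengthCounter()
    counter.feed(html)
    counter.close()
    return counter.total >= min_chars
```

Because the decision depends only on the page's own content, a page script has nothing to call to request reader mode, which matches the comment's point that the feature is purely heuristic.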

  • Firefox reader mode showing a small x instead of lyrics, intentional from that website's devs?
    reddit.com/r/firefox | 2021-04-05
    Thanks for helping make Firefox better. I filed a bug for this.

Stats

Basic readability repo stats
18
4,381
5.4
about 1 month ago