JavaScript Readability

Open-source JavaScript projects categorized as Readability

Top 9 JavaScript Readability Projects

  1. percollate

    A command-line tool to turn web pages into readable PDF, EPUB, HTML, or Markdown docs.

    Project mention: Show HN: Epublifier – scrape pages (books, manuals) for offline reading | news.ycombinator.com | 2024-10-21

    For those interested in a simple-to-use command-line tool that accomplishes the same, I've had success with percollate - https://github.com/danburzo/percollate

  3. article-extractor

    Extract the main article from a given URL with Node.js.

    Project mention: ScrapeGraphAI: Web scraping using LLM and direct graph logic | news.ycombinator.com | 2024-05-07

    Agreed!

    Apify's Website Content Crawler[0] does a decent job of this for most websites in my experience. It allows you to "extract" content via different built-in methods (e.g. Extractus [1]).

    We currently use this at Magic Loops[2] and it works _most_ of the time.

    The long-tail is difficult though, and it's not uncommon for users to back out to raw HTML, and then have our tool write some custom logic to parse the content they want from the scraped results (fun fact: before GPT-4 Turbo, the HTML page was often too large for the context window... and sometimes it still is!).

    Would love a dedicated tool for this. I know the folks at Reworkd[3] are working on something similar, but not sure how much is public yet.

    [0] https://apify.com/apify/website-content-crawler

    [1] https://github.com/extractus/article-extractor

    [2] https://magicloops.dev/

    [3] https://reworkd.ai/

  4. Just-Read

    A customizable read mode web extension.

  5. apca-w3

    The APCA version, to be licensed for use with guidelines: W3/AGWG.

  6. stutter

    RSVP (rapid serial visual presentation) for browsers (by jamestomasino)

  7. retext-readability

    A retext plugin to check readability.

  8. readability-extractor

    JavaScript/Node wrapper around Mozilla's Readability library so that ArchiveBox can call it as a one-shot CLI command to extract each page's article text.

  9. line-length

    Measure lengths of text on a page

  10. validate-access

    Parse and validate a given directory with multiple entries

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

JavaScript Readability related posts

  • How do Instapaper and Pocket apps extract the content of the articles?

    1 project | /r/opensource | 4 Dec 2023
  • Share my down(load) function!

    1 project | /r/commandline | 22 May 2023
  • Reverse Engineering or Recreating the Chrome Extension?

    1 project | /r/RemarkableTablet | 21 Jan 2023
  • How do I enabled right click menu and developer console on a site that disabled it?

    1 project | /r/uBlockOrigin | 13 Oct 2022
  • software or browser extension to reformat text?

    1 project | /r/TBI | 11 Oct 2022
  • Reading web articles on the reMarkable

    1 project | /r/RemarkableTablet | 30 Aug 2022
  • Pa. commission proposes adding and increasing fees, axing gas tax to fund transportation needs.

    1 project | /r/pittsburgh | 14 Jul 2022

Index

What are some of the best open-source Readability projects in JavaScript? This list will help you:

#  Project                Stars
1  percollate             4,350
2  article-extractor      1,632
3  Just-Read              1,216
4  apca-w3                  159
5  stutter                  140
6  retext-readability        94
7  readability-extractor     38
8  line-length                5
9  validate-access            3
