-
reader
Discontinued. Get a shareable reader-mode URL for any URL. Built with Cloudflare Workers: https://reader.tuananh.net (by tuananh)
-
ReadabiliPy
A simple HTML content extractor in Python. Can be run as a wrapper for Mozilla's Readability.js package or in pure-python mode.
I love this feature so much that I built a website that lets me share clean-text URLs with others.
The first one uses a JS lib, but it's kind of limited. The second one uses Go, compiled to wasm.
- https://github.com/tuananh/reader
I have used and love readability.js. I used it in an application that lets you run various NLP analyses over a web page (surprisals, reading time, word counts, etc.). For that, I needed only the main page content. readability.js retrieves main page content well, consistently.
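For a rough idea of what the word-count and reading-time part might look like once the main content has been extracted, here is a minimal sketch. It is not the code from that application: the `reading_stats` name and the 200 words-per-minute default are assumptions made up for illustration.

```python
import re


def reading_stats(text: str, words_per_minute: int = 200) -> dict:
    """Return a word count and an estimated reading time for extracted text.

    Hypothetical helper: the 200 wpm default is an assumed reading speed,
    not a value taken from the original application.
    """
    # Treat runs of word characters (with an optional apostrophe part) as words.
    words = re.findall(r"\w+(?:'\w+)?", text)
    word_count = len(words)
    minutes = word_count / words_per_minute
    return {"word_count": word_count, "reading_minutes": round(minutes, 2)}


if __name__ == "__main__":
    sample = "readability.js retrieves main page content well."
    print(reading_stats(sample))
```

The input here is assumed to be the plain text that a Readability-style extractor returns, so navigation, ads, and footers do not inflate the counts.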
The Alan Turing Institute maintains a Python wrapper around readability.js, too: https://github.com/alan-turing-institute/ReadabiliPy.
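A minimal sketch of calling ReadabiliPy from Python, assuming the package is installed (`pip install readabilipy`). The `simple_json_from_html_string` function and its `use_readability` flag come from the project's README; the `join_paragraphs` and `extract_main_text` helpers are hypothetical names added here for illustration.

```python
def join_paragraphs(article: dict) -> str:
    """Join the paragraph entries of a ReadabiliPy result into one string.

    Assumes the result has a "plain_text" list of {"text": ...} entries,
    as described in the ReadabiliPy README.
    """
    paragraphs = article.get("plain_text") or []
    return "\n".join(p["text"] for p in paragraphs)


def extract_main_text(html: str, use_readability: bool = False) -> str:
    """Extract the main content of an HTML page as plain text.

    use_readability=False runs the pure-Python mode; True wraps Mozilla's
    Readability.js and needs Node.js available on the machine.
    """
    # Imported lazily so the helper above works without the package installed.
    from readabilipy import simple_json_from_html_string

    article = simple_json_from_html_string(html, use_readability=use_readability)
    return join_paragraphs(article)


if __name__ == "__main__":
    page = "<html><body><article><p>Hello, reader mode.</p></article></body></html>"
    print(extract_main_text(page))
```

The pure-Python mode is the easier one to try first, since it avoids the Node.js dependency; switching `use_readability` to `True` should give results closer to Firefox's reader view.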
If you run it inside a container, it's fairly simple: https://github.com/phpdocker-io/readability-js-server
See also the C port here: https://github.com/eafer/rdrview/
It works well with text-mode browsers like w3m.
Clipper.js is built on top of Mozilla's Readability library and uses Turndown to convert HTML to Markdown: https://github.com/philschmid/clipper.js