unclutter
parser
| | unclutter | parser |
|---|---|---|
| Mentions | 39 | 11 |
| Stars | 1,184 | 5,157 |
| Growth | 1.2% | 1.3% |
| Activity | 8.1 | 1.1 |
| Last commit | about 1 month ago | 5 months ago |
| Language | TypeScript | JavaScript |
| License | GNU Affero General Public License v3.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
unclutter
-
Show HN: Reader Mode, but Better
Another question: do you look at your saved links frequently, for example to browse by the tag you assigned? What's the primary purpose of tagging?
I just created a ticket for this: https://github.com/lindylearn/unclutter/issues/595
Hey thank you! I'm really glad you like the extension.
There actually is an "auto-activate" feature you can enable in the settings. Is this what you had in mind?
Regarding mobile support, I know. I'm not sure how to handle mobile Chrome (which doesn't allow extensions), but for Safari this should be possible. See https://github.com/lindylearn/unclutter/issues/529
I like it. Ought to be a built-in browser feature, in addition to generic reader mode. But then so should uBO-level ad blocking...so you may have many years.
Have you considered donations/sponsors/patrons? Presumably more than an occasional coffee requires massive numbers of users for something like this, but maybe you could do something like prioritize attention to site-specific fixes (which seem to be the main thing in https://github.com/lindylearn/unclutter/issues) for supporters.
It is good enough for many people and readability.js is re-used in many other projects. I'm really grateful for it.
Unclutter just produces more visually pleasing results by keeping the original style of the website intact. Here's a side-by-side comparison: https://github.com/lindylearn/unclutter/blob/main/docs/compa...
It's possible for extensions to only get access when you activate it for a specific tab. If you want Unclutter to work this way you can manually set "Site access" to "On click" in the Chrome extension settings.
The reason I enabled "all sites" by default is to make the automatic activation feature work (which used to be more powerful). Possibly this can be done with an optional all-sites permission now, I'll look into it: https://github.com/lindylearn/unclutter/issues/527
Thanks for the feedback! Also, if you don't like something, build the extension yourself or submit a PR ;)
I was also skeptical at first, but the code is open source, and there's clear documentation on privacy policy and metrics collected.
https://github.com/lindylearn/unclutter/blob/main/docs/metri...
Bravo to the author, well done for earning the users' trust.
Makes sense!
I've actually had a similar feature request before: https://github.com/lindylearn/unclutter/issues/297
Have you tried some of the existing solutions mentioned in the ticket? I'm curious if they solve the problem for you, and if not, what Unclutter could do better.
Not an edge case, I've been tracking this for a while: https://github.com/lindylearn/unclutter/issues/13
Someone else in this thread suggested a version for mobile Safari which made supporting this even more interesting. No promises, but hopefully I can get to this before the end of the year.
There already is crowdsourcing of broken page reports: https://github.com/lindylearn/unclutter/issues?q=is%3Aissue+...
And twitter.com is a special case: https://github.com/lindylearn/unclutter/issues/570
I'm working on those, but it's never going to be perfect unfortunately.
-
Unclutter — a browser extension to read & save articles
All of this is only possible through the feedback and Open-Source contributions from all of you! Here’s more info: unclutter.lindylearn.io
parser
-
Trouble Building Chrome Extension to Get News Article Content
I've been working on an enhanced reader mode extension for the last few months. I found that Mercury Reader's parser tool is useful for extracting content. If that's not exactly what you're looking for, readability is another good option. It's a library used inside Firefox's reader mode that you can use in any project.
-
What Are The Coolest Virtual Machines You Currently Run 24/7?
I currently have it turned off while I search for better sources, but I have a VM that runs a custom cron script that combines a custom RSS reader, podfox, mercury-parser, and coqui-ai to generate audio podcasts from RSS news feeds. I should probably clean it up and release the script/setup process. With a few tweaks and some AI text-to-speech and a little machine learning audio processing you can get a really good podcast experience from text posts.
-
Which are some open-source Chrome extensions you want to use on Firefox?
https://github.com/postlight/mercury-parser The only one I need, shit's too good
-
API for getting news fulltext
An alternative would be to extract the plain text from the article's page with either some "readability" API or a library like Mercury Parser: https://github.com/postlight/mercury-parser
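To make "extract the plain text from the article's page" concrete, here is a minimal sketch using only Python's standard library. It is a crude stand-in, not how readability or Mercury Parser work: those libraries use far more sophisticated heuristics to locate the article body. The class name `TextExtractor`, the function `html_to_text`, and the list of skipped tags are all illustrative assumptions.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, ignoring tags that rarely hold article content."""

    # Assumption: these tags are treated as boilerplate for this sketch.
    SKIPPED_TAGS = {"script", "style", "nav", "header", "footer", "aside"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that sits outside skipped elements.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self) -> str:
        return " ".join(self._chunks)


def html_to_text(html: str) -> str:
    """Return the visible text of an HTML document as a single string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

For real news pages a dedicated extraction library will give much cleaner results; this only shows the general shape of the task.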
-
How does Firefox's Reader View work?
I haven’t directly compared them, but I have also found mercury parser (https://github.com/postlight/mercury-parser) to be very reliable.
Since it turns a website into very plain (X)HTML it's fairly easy to use it to make a browsing proxy or automatically produce epub files for e-readers, which is what I do.
-
Build your self-hosted Evernote
Make sure that at the end of the process you have the node and npm executables installed - the http.webpage integration uses the Mercury Parser API to convert web pages to Markdown.
-
Reading from the web offline and distraction-free
Good luck! Those HTML issues you're coming across are tough and so varied across the web!
I was working with Mercury Parser (pluggable parsing for different sites) in the past.
- The most underused browser feature
-
A Unix-style personal search engine and web crawler for your digital footprint
Sadly not - I'd love it to do that, but the Pocket API doesn't make that available.
I've been contemplating building an add-on for Dogsheep that can do this for any given URL (from Pocket or other sources) by shelling out to an archive script such as https://github.com/postlight/mercury-parser - I collected some suggestions for libraries to use here: https://twitter.com/simonw/status/1401656327869394945
That way you could save a URL using Pocket or browser bookmarks or Pinboard or anything else that I can extract saved URLs from, and a separate script could then archive the full contents for you.
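The "separate script that shells out to an archiver" idea might look roughly like the sketch below. It is not Dogsheep code. It assumes the external command takes the URL as its last argument and prints a single JSON object to stdout, which is my understanding of how the `mercury-parser` command-line tool behaves; the function names `archive_url` and `archive_all` are hypothetical.

```python
import json
import subprocess
from typing import Iterable, Sequence


def archive_url(url: str, command: Sequence[str] = ("mercury-parser",)) -> dict:
    """Run an external parser CLI on `url` and return its parsed JSON output.

    Assumption: the CLI accepts the URL as its final argument and writes
    one JSON object to stdout.
    """
    result = subprocess.run(
        [*command, url],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
        timeout=60,  # do not hang forever on a slow page
    )
    return json.loads(result.stdout)


def archive_all(urls: Iterable[str], command: Sequence[str] = ("mercury-parser",)) -> dict:
    """Archive every URL, recording failures instead of stopping at the first."""
    archived, failed = {}, {}
    for url in urls:
        try:
            archived[url] = archive_url(url, command)
        except (subprocess.SubprocessError, json.JSONDecodeError, OSError) as exc:
            failed[url] = str(exc)
    return {"archived": archived, "failed": failed}
```

Keeping the command as a parameter means the saved-URL source (Pocket, bookmarks, Pinboard) and the archiver stay independent, so either can be swapped without touching the other.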
What are some alternatives?
readability - A standalone version of the readability lib
hn-search - Hacker News Search
Just-Read - A customizable read mode web extension.
FParsec - A parser combinator library for F#
tidy-html5 - The granddaddy of HTML tools, with support for modern standards
murder - Large scale server deploys using BitTorrent and the BitTornado library
rdrview - Firefox Reader View as a command line tool
arc90-readability - A copy of the original Arc90 repo with links to many of the current ports.
Shiori - Simple bookmark manager built with Go
termux-widget - Termux add-on app which adds shortcuts to commands on the home screen.
Camlistore - Perkeep (née Camlistore) is your personal storage system for life: a way of storing, syncing, sharing, modelling and backing up content.
percollate - A command-line tool to turn web pages into readable PDF, EPUB, HTML, or Markdown docs.