readability
NATS
Metric | readability | NATS
---|---|---
Mentions | 52 | 106
Stars | 8,100 | 14,816
Growth | 3.7% | 1.1%
Activity | 6.3 | 9.8
Latest commit | 13 days ago | about 17 hours ago
Language | JavaScript | Go
License | GNU General Public License v3.0 or later | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
readability
-
2markdown – Transform Websites into Markdown
Why not just use something like https://github.com/mozilla/readability
And not pay $0.01 per request?
There’s a node version too https://www.npmjs.com/package/@mozilla/readability
- Mozilla: Readability.js
-
CSS for readability
I'm working with Mozilla's Readability library https://github.com/mozilla/readability to get the "readable" text from articles, and now I want to style the extracted text in a readable way.
-
Building a Serverless Reader View with Lambda and Chrome
Do you remember the Firefox Reader View? It's a feature that removes all unnecessary components like buttons, menus, images, and so on, from a website, focusing on the readable content of the page. The library powering this feature is called Readability.js, which is open source.
-
Webrecorder: Capture interactive websites and replay them at a later time
I wonder if Firefox "reader mode as a utility" might be a viable alternative for Pinboard-like "content oriented" archiving?
https://github.com/mozilla/readability
-
Creating an advanced search engine with PostgreSQL
Depending on the type of content, one might want to look into using Readability (the browser's reader view) to parse the webpage. It will give you all the useful info without the junk. Then you can put it in the DB as needed.
https://github.com/mozilla/readability
Btw, Readability is also available in a few other languages, like Kotlin:
https://github.com/dankito/Readability4J
-
Seeking a tool or method to convert webpages into Q&A format using NLP
Use Mozilla's Readability to extract that sweet, sweet text content from webpages.
-
I built a free prompt managing tool - Knit
Same as above but the ability to grab the entire article text (you can use the Readability library for that: https://github.com/mozilla/readability)
-
I need automatic source URLs when I paste any text onto a card or note, like on OneNote.
    // Original script
    // https://gist.github.com/kepano/90c05f162c37cf730abb8ff027987ca3
    // Bookmarklet Converter
    // https://caiorss.github.io/bookmarklet-maker/
    // Libraries
    // https://github.com/mixmark-io/turndown
    // https://github.com/mozilla/readability
    javascript: Promise.all([
      import('https://unpkg.com/[email protected]?module'),
      import('https://unpkg.com/@tehshrike/[email protected]'),
    ]).then(async ([{ default: Turndown }, { default: Readability }]) => {
      /* Optional vault name */
      const vault = "";
      /* Optional folder name such as "Clippings/" */
      const folder = "Clippings/";
      /* Optional tags */
      const tags = "";

      function getSelectionHtml() {
        var html = "";
        if (typeof window.getSelection != "undefined") {
          var sel = window.getSelection();
          if (sel.rangeCount) {
            var container = document.createElement("div");
            for (var i = 0, len = sel.rangeCount; i < len; ++i) {
              container.appendChild(sel.getRangeAt(i).cloneContents());
            }
            html = container.innerHTML;
          }
        } else if (typeof document.selection != "undefined") {
          if (document.selection.type == "Text") {
            html = document.selection.createRange().htmlText;
          }
        }
        return html;
      }

      const selection = getSelectionHtml();
      const { title, byline, content } = new Readability(document.cloneNode(true)).parse();

      function getFileName(fileName) {
        var userAgent = window.navigator.userAgent,
          platform = window.navigator.platform,
          windowsPlatforms = ['Win32', 'Win64', 'Windows', 'WinCE'];
        if (windowsPlatforms.indexOf(platform) !== -1) {
          fileName = fileName.replace(':', '').replace(/[/\\?%*|"<>]/g, '-');
        } else {
          fileName = fileName.replace(':', '').replace(/\//g, '-').replace(/\\/g, '-');
        }
        return fileName;
      }

      const fileName = getFileName(title);
      if (selection) {
        var markdownify = selection;
      } else {
        var markdownify = content;
      }
      if (vault) {
        var vaultName = '&vault=' + encodeURIComponent(`${vault}`);
      } else {
        var vaultName = '';
      }

      const markdownBody = new Turndown({
        headingStyle: 'atx',
        hr: '---',
        bulletListMarker: '-',
        codeBlockStyle: 'fenced',
        emDelimiter: '*',
      }).turndown(markdownify);

      var date = new Date();
      function convertDate(date) {
        var yyyy = date.getFullYear().toString();
        var mm = (date.getMonth()+1).toString();
        var dd = date.getDate().toString();
        var mmChars = mm.split('');
        var ddChars = dd.split('');
        return yyyy + '-' + (mmChars[1]?mm:"0"+mmChars[0]) + '-' + (ddChars[1]?dd:"0"+ddChars[0]);
      }
      const today = convertDate(date);

      // This is the output template
      // It is similar to an Obsidian core template
      // except to insert a value we use: ${value} instead of {{value}}
      const fileContent =`---
    type: clipping
    date_added: ${today}
    aliases: []
    tags: [${tags}]
    ---
    author:: ${byline.toString().split('\n')[0].trim()}
    source:: [${title}](${document.URL})
    ${markdownBody}
    `;

      // This copies your text to the clipboard
      navigator.clipboard.writeText(fileContent);

      // This creates a new document in Obsidian containing your clipping
      // I commented it out as this isn't what you asked for
      /*
      document.location.href = "obsidian://new?"
        + "file=" + encodeURIComponent(folder + fileName)
        + "&content=" + encodeURIComponent(fileContent)
        + vaultName;
      */
    })
- Any js packages to only scrape relevant content from a webpage?
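The clipping bookmarklet quoted above reads `title` and `byline` straight from the result of `parse()`. As far as I know, Readability can return `null` when it finds no article, and `byline` can be `null` when no author is detected, so a more defensive version of the template step might look like the sketch below. The helper names `formatDate` and `buildClipping` are hypothetical and not part of Readability or the original script; the front matter is trimmed for brevity.

```javascript
// Hypothetical helpers illustrating a null-safe version of the template step.
// Neither function is part of Readability, Turndown, or the original script.

// Format a Date as YYYY-MM-DD using local time, zero-padding month and day.
function formatDate(date) {
  const yyyy = date.getFullYear();
  const mm = String(date.getMonth() + 1).padStart(2, "0");
  const dd = String(date.getDate()).padStart(2, "0");
  return `${yyyy}-${mm}-${dd}`;
}

// Build the clipping text from a Readability-style result object.
// `article` is assumed to have the shape { title, byline }, or to be null.
function buildClipping(article, url, markdownBody, today) {
  if (!article) return null; // nothing could be extracted from the page
  const title = article.title || "Untitled";
  // Keep only the first line of the byline; fall back when it is missing.
  const author = (article.byline || "").split("\n")[0].trim() || "unknown";
  return [
    "---",
    "type: clipping",
    `date_added: ${today}`,
    "---",
    `author:: ${author}`,
    `source:: [${title}](${url})`,
    markdownBody,
  ].join("\n");
}

// Usage with a stand-in result object (no author detected).
const clipping = buildClipping(
  { title: "Example article", byline: null },
  "https://example.com/post",
  "Body text",
  formatDate(new Date(2024, 0, 5))
);
console.log(clipping);
```

In the bookmarklet itself, the first argument would be the value returned by `new Readability(document.cloneNode(true)).parse()`.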
NATS
-
Implementing OTel Trace Context Propagation Through Message Brokers with Go
Several message brokers, such as NATS and database queues, are not supported by OpenTelemetry (OTel) SDKs. This article will guide you on how to use context propagation explicitly with these message queues.
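As a rough illustration of what explicit context propagation can mean here (the article itself uses Go; this sketch is in JavaScript and is not taken from it), the producer can write a W3C `traceparent` value into the message headers and the consumer can parse it back before starting its own span. The helpers `injectTraceContext` and `extractTraceContext` are hypothetical names, not OTel SDK or NATS client APIs, and the message is a plain object standing in for a broker message.

```javascript
// Sketch of explicit W3C Trace Context propagation through message headers.
// injectTraceContext / extractTraceContext are hypothetical helpers for
// illustration only; they are not part of any OTel SDK or NATS client.

// traceparent layout: version "00", 32-hex trace ID, 16-hex span ID, 2-hex flags.
const TRACEPARENT = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

// Producer side: serialize the current span context into a header map.
function injectTraceContext(headers, ctx) {
  const flags = ctx.sampled ? "01" : "00";
  headers["traceparent"] = `00-${ctx.traceId}-${ctx.spanId}-${flags}`;
  return headers;
}

// Consumer side: recover the parent context, or null if absent or malformed.
function extractTraceContext(headers) {
  const match = TRACEPARENT.exec(headers["traceparent"] || "");
  if (!match) return null;
  const [, traceId, spanId, flags] = match;
  // All-zero trace or span IDs are invalid in the W3C format.
  if (/^0+$/.test(traceId) || /^0+$/.test(spanId)) return null;
  return { traceId, spanId, sampled: (parseInt(flags, 16) & 1) === 1 };
}

// Usage: the producer attaches headers to the outgoing message ...
const message = {
  headers: injectTraceContext({}, {
    traceId: "4bf92f3577b34da6a3ce929d0e0e4736",
    spanId: "00f067aa0ba902b7",
    sampled: true,
  }),
  data: "order.created",
};

// ... and the consumer restores the parent context before starting its span.
const parent = extractTraceContext(message.headers);
console.log(message.headers.traceparent);
console.log(parent);
```

With a real broker client, the header map would be whatever header structure that client exposes; the plain object above is only a placeholder.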
-
NATS: First Impressions
https://nats.io/ (Tracker removed)
> Connective Technology for Adaptive Edge & Distributed Systems
> An Introduction to NATS - The first screencast
I guess I don't need to know what it is
-
Interview with Sebastian Holstein, Founder of Qaze
During our interview, we referred to NATS quite a few times! If you want to learn more about it, Sebastian suggests this tutorial series.
-
Sequential and parallel execution of long-running shell commands
Pueue dumps the state of the queue to the disk as JSON every time the state changes, so when you have a lot of queued jobs this results in considerable disk io. I actually changed it to compress the state file via zstd which helped quite a bit but then eventually just moved on to running NATS [1] locally.
[1] https://nats.io/
-
Revolutionizing Real-Time Alerts with AI, NATS and Streamlit
Imagine you have an AI-powered personal alerting chat assistant that interacts using up-to-date data. Whether it's a big move in the stock market that affects your investments, any significant change on your shared SharePoint documents, or discounts on Amazon you were waiting for, the application is designed to keep you informed and alert you about any significant changes based on the criteria you set in advance using your natural language. In this post, we will learn how to build a full-stack event-driven weather alert chat application in Python using pretty cool tools: Streamlit, NATS, and OpenAI. The app can collect real-time weather information, understand your criteria for alerts using AI, and deliver these alerts to the user interface.
-
New scalable, fault-tolerant, and efficient open-source MQTT broker
Why wasn't NATS [1] used?
Written in Go, single-binary deployment... there's a lot to love about NATS!
[1] https://nats.io/
-
Scripting with NATS.io support
require nats.io
-
Introducing “Database Performance at Scale”: A Free, Open Source Book
About cost, see [1]. Also, S3 prices have been increasing and there's been a bunch of alternative offers for object store from other companies. I think people in here (HN) comment often about increasing costs of AWS offerings.
Distributed systems and consensus are inherently hard problems, but there are a lot of implementations that you can study (like etcd, which you mention, or NATS [2], which I've been playing with and which looks super cool so far :-p) if you want to understand the internals, on top of the many books and papers released.
Again, I never said it was "easy" to build distributed systems, I just don't think there's any esoteric knowledge to what S3 provides.
--
1: https://en.wikipedia.org/wiki/Economies_of_scale
2: https://nats.io/
- NATS: Connective Technology for Adaptive Edge and Distributed Systems
-
Is it an antipattern to use the response channel as identifier
I am on a project where nats.io is used. Someone thought it would be a great idea to link data in an event with data in a response using the response channel name.
What are some alternatives?
parser - 📜 Extract meaningful content from the chaos of a web page
RabbitMQ - Open source RabbitMQ: core server and tier 1 (built-in) plugins
koreader - An ebook reader application supporting PDF, DjVu, EPUB, FB2 and many more formats, running on Cervantes, Kindle, Kobo, PocketBook and Android devices
celery - Distributed Task Queue (development branch)
hn-search - Hacker News Search
redpanda - Redpanda is a streaming data platform for developers. Kafka API compatible. 10x faster. No ZooKeeper. No JVM!
readability.php - PHP port of Mozilla's Readability.js
ZeroMQ - ZeroMQ core engine in C++, implements ZMTP/3.1
rssguard - Feed reader (and podcast player) which supports RSS/ATOM/JSON and many web-based feed services.
Apache ActiveMQ - Mirror of Apache ActiveMQ
SponsorBlock - Skip YouTube video sponsors (browser extension)
nsq - A realtime distributed messaging platform