hn-search
readability
Metric | hn-search | readability
---|---|---
Mentions | 2,025 | 56
Stars | 555 | 9,795
Growth | 0.0% | 3.5%
Activity | 2.9 | 7.5
Last commit | over 1 year ago | about 1 month ago
Language | TypeScript | JavaScript
License | GNU General Public License v3.0 or later | GNU General Public License v3.0 or later
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
hn-search
- Privacy concerns mount as Dutch intelligence continues to share data with U.S.
Sharing data with the US Gov increasingly means delivering that data into the hands of the White House's top ally - Peter Thiel/Palantir
ref: https://hn.algolia.com/?dateRange=pastMonth&page=0&prefix=fa...
- You Wouldn't Steal a Font
- Pete Hegseth shared Yemen attack details in second Signal chat
https://hn.algolia.com/?sort=byDate&dateRange=all&type=comme...
- Nitrogen runoff leads to preventable health outcomes, experts say (2021)
- Whistleblower: Doge Siphoned NLRB Case Data
Guys, I want to investigate this claim, but people keep making it without giving me any details to look into. If you give us a specific news item or date range, we can look at the data and see what was happening (we have access to internal and external tools that show where each story was ranked at different times).
Also: any time you know of an important story that you think should be on the front page, you can email us to let us know - [email protected]. We'll either address it or explain why we're doing something other than what you're asking for.
> Someone relying on HN as their primary news aggregator
We don't expect anyone to be doing this, unless they actively want to not know what is happening in the mainstream news. Hacker News is very explicitly meant to be for things other than the normal stories in the mainstream news. This has been in the guidelines forever:
On-Topic: Anything that good hackers would find interesting. That includes more than hacking and startups. If you had to reduce it to a sentence, the answer might be: anything that gratifies one's intellectual curiosity.
Off-Topic: Most stories about politics, or crime, or sports, or celebrities, unless they're evidence of some interesting new phenomenon. Videos of pratfalls or disasters, or cute animal pictures. If they'd cover it on TV news, it's probably off-topic.
Even still, we have had huge numbers of heavily upvoted/discussed front-page stories about DOGE, which is clear from looking at this list:
https://hn.algolia.com/?dateEnd=1745410493&dateRange=custom&...
Again I say, if there's a story that you think should be in that list that wasn't, please let us know about it and we can investigate or explain.
- Computational Complexity of Air Travel Planning [pdf]
This is a very popular article that gets submitted every now and then (read: nearly every year).
Past:
https://hn.algolia.com/?query=Computational%20Complexity%20o...
- Pope Francis Has Died
Interestingly, [X has died] posts seem to be among the most upvoted on HN. (Based on https://hn.algolia.com/)
- Silicon Valley crosswalk buttons apparently hacked to imitate Musk, Zuck voices
This HN post is seven days old but presented as if it was posted 9 hours ago. The déjà vu effect is disconcerting and an absolute mind fuck. Please stop doing this, ffs. The person who thought this would be a good idea is a madman.
https://hn.algolia.com/?query=Silicon%20Valley%20crosswalk%2...
- College Towns: Urbanism from a Past Era with Ryan Allen
I really like the Barcelona Superblocks model [1], but paying people to move closer to schools is also an option imho.
[1] https://hn.algolia.com/?q=barcelona+superblocks
- Judge holds Trump administration in criminal contempt over deportation flights
Political interest and intellectual interest are not the same thing. HN is for the latter.
There is overlap, of course [1], but there is also a huge amount of political and social material which is not primarily about intellectual curiosity. Most of that is off topic on HN (as the site guidelines say), even though much of it is far more important—as you say—than nearly anything else on HN.
[1] https://hn.algolia.com/?dateRange=all&page=0&prefix=false&so...
readability
- Show HN: Hacker News with AI-generated summaries
Hey, thanks for checking it out!
Yeah, the webpage parser (https://github.com/mozilla/readability) doesn’t work well with social media sites like Twitter/X. It can summarize up to about 37,500 words, as long as the website is parsable.
I haven’t gotten around to adding pagination for the “More” pages yet. It currently only handles the home route/front page (https://news.ycombinator.com/). The cost of summaries does add up, especially on longer articles, so I wanted to limit expenses until I can find a way to keep it running while at least breaking even.
- Show HN: I built a tool that summarizes web articles with AI
- Clean up HTML Content for Retrieval-Augmented Generation with Readability.js
Mozilla makes the underlying library for Firefox's reader mode available as a standalone open-source module: Readability.js. So we can use Readability.js in a data pipeline to strip irrelevant content and return high quality results from scraping a web page.
- Show HN: HTML-to-Markdown – convert entire websites to Markdown with Golang/CLI
- 2markdown – Transform Websites into Markdown
Why not just use something like https://github.com/mozilla/readability
And not pay $0.01 per request?
There’s a node version too https://www.npmjs.com/package/@mozilla/readability
- Mozilla: Readability.js
- CSS for readability
I'm working with Mozilla's readability library https://github.com/mozilla/readability to get the "readable" text from articles, and now I want to style the extracted text in a readable way.
- Building a Serverless Reader View with Lambda and Chrome
Do you remember the Firefox Reader View? It's a feature that removes all unnecessary components like buttons, menus, images, and so on, from a website, focusing on the readable content of the page. The library powering this feature is called Readability.js, which is open source.
- Webrecorder: Capture interactive websites and replay them at a later time
I wonder if Firefox "reader mode as a utility" might be a viable alternative for Pinboard-like "content-oriented" archiving?
https://github.com/mozilla/readability
- Creating an advanced search engine with PostgreSQL
Depending upon the type of content, one might want to look into using Readability (the browser's reader view) to parse the webpage. It will give you all the useful info without the junk. Then you can put it in the DB as needed.
https://github.com/mozilla/readability
Btw, readability is also available in a few other languages, like Kotlin:
https://github.com/dankito/Readability4J
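Several of the posts above describe the same pattern: run Readability over a page's HTML and keep only the readable content, for example before summarizing it or storing it in a database. A minimal sketch of that step in Node.js follows. It assumes the `@mozilla/readability` and `jsdom` packages are installed; `extractReadable` and `loadDeps` are hypothetical helper names chosen for illustration, and the whitespace normalization is one possible cleanup choice rather than anything the posts prescribe.

```javascript
// Sketch only: assumes `npm install @mozilla/readability jsdom`.
// `extractReadable` and `loadDeps` are hypothetical names, not part of either library.

function loadDeps() {
  // Loaded lazily so callers can inject their own implementations instead.
  const { JSDOM } = require('jsdom');
  const { Readability } = require('@mozilla/readability');
  return { JSDOM, Readability };
}

function extractReadable(html, url, deps = loadDeps()) {
  const { JSDOM, Readability } = deps;

  // Passing the page URL lets relative links in the document resolve against it.
  const dom = new JSDOM(html, { url });

  // parse() returns null when no article content can be identified.
  const article = new Readability(dom.window.document).parse();
  if (!article) return null;

  return {
    title: article.title,
    // Collapse runs of whitespace so the text is easier to chunk or index.
    text: article.textContent.replace(/\s+/g, ' ').trim(),
  };
}
```

Called as `extractReadable(html, pageUrl)` with HTML you have already fetched, this returns an object with `title` and `text`, or `null` when the page is not parsable as an article. The `deps` parameter exists only so the function can be exercised without the real packages.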
What are some alternatives?
v - Simple, fast, safe, compiled language for developing maintainable software. Compiles itself in <1s with zero library dependencies. Supports automatic C => V translation. https://vlang.io
parser - 📜 Extract meaningful content from the chaos of a web page
duckduckgo-locales - Translation files for https://duckduckgo.com
rssguard - Feed reader (podcast player and also Gemini protocol client) which supports RSS/ATOM/JSON and many web-based feed services.
fut - Fusion programming language. Transpiling to C, C++, C#, D, Java, JavaScript, Python, Swift, TypeScript and OpenCL C.
autoscraper - A Smart, Automatic, Fast and Lightweight Web Scraper for Python