parser VS hn-search

Compare parser and hn-search to see how they differ.

              parser               hn-search
Mentions      12                   1,619
Stars         5,245                524
Growth        2.2%                 1.5%
Activity      1.1                  2.9
Last commit   6 months ago         6 months ago
Language      JavaScript           TypeScript
License       Apache License 2.0   GNU General Public License v3.0 or later
Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

parser

Posts with mentions or reviews of parser. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-04-14.
  • Show HN: I made a tool to clean and convert any webpage to Markdown
    17 projects | news.ycombinator.com | 14 Apr 2024
    Thoroughly scraping is challenging, especially in an environment where you don’t have (or want) a JavaScript runtime.

    For content extraction, I found the approach the Postlight library takes quite neat. It scores individual HTML nodes based on some heuristics (text length, link density, CSS classes). It then selects the nodes with the highest score. [1] I ported it to Swift for a personal read-later app.

    [1] https://github.com/postlight/parser

  • Trouble Building Chrome Extension to Get News Article Content
    3 projects | /r/webdev | 22 Nov 2022
    I've been working on an enhanced reader mode extension for the last few months. I found that Mercury Reader's parser tool is useful for extracting content. If that's not exactly what you're looking for, readability is another good option. It's a library used inside Firefox's reader mode that you can use in any project.
  • What Are The Coolest Virtual Machines You Currently Run 24/7?
    10 projects | /r/selfhosted | 10 Oct 2022
    I currently have it turned off while I search for better sources, but I have a VM that runs a custom cron script that combines a custom RSS reader, podfox, mercury-parser, and coqui-ai to generate audio podcasts from RSS news feeds. I should probably clean it up and release the script/setup process. With a few tweaks and some AI text-to-speech and a little machine learning audio processing you can get a really good podcast experience from text posts.
  • Extracting Text button no longer works
    1 project | /r/RelayForReddit | 22 Sep 2022
    It looks like Relay could be updated to convert it locally though, since the parser that it uses appears to be open source.
  • Which are some open-source Chrome extensions you want to use on Firefox?
    7 projects | /r/firefox | 16 Apr 2022
    https://github.com/postlight/mercury-parser The only one I need, shit's too good
  • API for getting news fulltext
    2 projects | /r/api | 8 Apr 2022
    An alternative would be to extract the plain text from the article's page with either some "readability" API or a library like Mercury Parser: https://github.com/postlight/mercury-parser
  • How does Firefox's Reader View work?
    15 projects | news.ycombinator.com | 30 Mar 2022
    I haven’t directly compared them, but I have also found mercury parser (https://github.com/postlight/mercury-parser) to be very reliable.

    Since it turns a website into very plain (X)HTML it’s fairly easy to use it to make a browsing proxy or automatically produce epub files for e-readers, which is what I do.

  • Build your self-hosted Evernote
    12 projects | dev.to | 6 Jan 2022
    Make sure that at the end of the process you have the node and npm executables installed - the http.webpage integration uses the Mercury Parser API to convert web pages to Markdown.
  • Reading from the web offline and distraction-free
    7 projects | news.ycombinator.com | 10 Oct 2021
    Good luck! Those HTML issues you're coming across are tough and so varied across the web!

    I was working with Mercury Parser (pluggable parsing for different sites) in the past.

    https://github.com/postlight/mercury-parser

  • The most underused browser feature
    22 projects | news.ycombinator.com | 25 Aug 2021
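
The node-scoring idea described in the "Show HN" post above (text length, link density, CSS classes) can be illustrated with a rough sketch. This is not Postlight's actual implementation: the `Node` class, the class-name hints, and the bonus of 25 points are all hypothetical, chosen only to show the shape of the heuristic.

```python
import re
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical stand-in for a parsed HTML element."""
    css_class: str   # value of the element's class attribute
    text: str        # all visible text inside the element
    link_text: str   # the part of that text that sits inside <a> tags


# Assumed class-name hints; a real extractor would use a longer, tuned list.
POSITIVE_HINTS = re.compile(r"article|content|post", re.IGNORECASE)
NEGATIVE_HINTS = re.compile(r"sidebar|footer|nav|comment", re.IGNORECASE)
CLASS_BONUS = 25  # arbitrary weight, for illustration only


def link_density(node: Node) -> float:
    """Fraction of the node's text that is link text (0.0 for empty nodes)."""
    if not node.text:
        return 0.0
    return len(node.link_text) / len(node.text)


def score(node: Node) -> float:
    """Reward long, link-poor text; adjust by class-name hints."""
    result = len(node.text) * (1.0 - link_density(node))
    if POSITIVE_HINTS.search(node.css_class):
        result += CLASS_BONUS
    if NEGATIVE_HINTS.search(node.css_class):
        result -= CLASS_BONUS
    return result


def best_node(nodes: list[Node]) -> Node:
    """Pick the candidate with the highest score."""
    return max(nodes, key=score)
```

Given a navigation block whose text is entirely links and an article body with a long paragraph and no links, `best_node` would be expected to return the article body, since link-heavy text contributes nothing to the score.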

hn-search

Posts with mentions or reviews of hn-search. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-04-23.
  • Gary Killdall, creator of CP/M, wrote Pixar's original 3D renderer [pdf]
    1 project | news.ycombinator.com | 27 Apr 2024
    The submitted title was "Gary Killdall, creator of CP/M, wrote Pixar's original 3D renderer".

    Submitters: If you want to say what you think is important about an article, that's fine, but do it by adding a comment to the thread. Then your view will be on a level playing field with everyone else's: https://hn.algolia.com/?dateRange=all&page=0&prefix=false&so...

    (From https://news.ycombinator.com/newsguidelines.html: "Please use the original title, unless it is misleading or linkbait; don't editorialize.")

  • Nearsightedness is at epidemic levels – and the problem begins in childhood
    1 project | news.ycombinator.com | 24 Apr 2024
    Vision therapy for myopia helps some people, but not everyone, likely due to genetic and neuroplasticity differences, https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu.... Nevertheless, many of the principles are useful for children whose eyes and brains are still developing.
  • Tesla driver arrested for homicide after running over motorcyclist on Autopilot
    1 project | news.ycombinator.com | 24 Apr 2024
    I'm a huge Tesla skeptic, but Tesla and Musk are lightning rods for tabloid-style garbage that doesn't belong on HN, so it doesn't surprise me that we often see negative Tesla content flagged to death. Meanwhile we also see plenty of content that hits the front page and stays there [0].

    Do you have examples of professional, interesting Tesla content that got flagged?

    [0] More than half of the past year's most popular Tesla articles were negative: https://hn.algolia.com/?dateRange=pastYear&page=0&prefix=tru...

  • The Man Who Killed Google Search
    3 projects | news.ycombinator.com | 23 Apr 2024
    It's April 23rd, 2024, and I am still looking for a good, reliable, honest and simple search engine.

    All I want to do is search.

    No AI.

    No ads.

    No shopping.

    Please don't "Answer my question." I enjoy doing my own original research, thanks.

    I'm entirely willing - wanting even - to pay for it.

    Currently Kagi has my $, but I'm saddened and frustrated that they're not even focused on Search, they're focused on AI[1] and t-shirts.

    Amazingly, in 2024, there is still a market opportunity for a good search engine.

    It can't really just be me, can it?

    [1]: https://hn.algolia.com/?query=%22kagi%22+%22ai%22

  • Ask HN: Is Hacker News under attack from spam bots?
    1 project | news.ycombinator.com | 22 Apr 2024
    https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...

    For historical purposes

  • Tesla Recalls All Cybertrucks for Faulty Accelerator Pedals
    1 project | news.ycombinator.com | 21 Apr 2024
    Most likely because there have been oodles of low-quality stories on these topics. We turned the flags off on this one since it maybe rises above the noise (see https://hn.algolia.com/?dateRange=all&page=0&prefix=false&so... for past explanations on how we approach that).
  • Show HN: What Are You Working On?
    6 projects | news.ycombinator.com | 21 Apr 2024
    Hey HN,

    I'm sure you've seen the monthly "Ask HN: What Are You Working On?" headlines on [Hacker News](https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...).

    Honestly, it's my favorite topic because it's packed with insights about what other hackers are up to.

    I wondered what it would be like if instead of just a headline, there was a whole website where hackers could post daily updates, and where we could follow the hackers we're interested in for their latest updates. And so, this web site was born.

    I hope it gets used frequently so we can all benefit from it together. I look forward to hearing your thoughts.

    Let me know what you think!

  • Not Apply to YC
    1 project | news.ycombinator.com | 20 Apr 2024
    I don't know what one thing you're referring to, but it's a core principle of HN to try to avoid repetition, and especially the repetition+indignation combo, which is the commonest and most tedious thing on the internet.

    https://hn.algolia.com/?dateRange=all&page=0&prefix=false&so...

  • Nand to Tetris: Building a Modern Computer System from First Principles
    1 project | news.ycombinator.com | 19 Apr 2024
    Happy 10,000 day to you

    https://news.ycombinator.com/from?site=nand2tetris.org

    https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...

    https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...

  • Moxie: I'm no longer involved at Signal
    1 project | news.ycombinator.com | 19 Apr 2024
    not sure. I searched comments: https://hn.algolia.com/?dateRange=pastYear&page=0&prefix=fal...

    Most recent are more culture wars stuff but some earlier ones appear to suggest a degree of alignment with the USA government.
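
Most of the posts above link to hn.algolia.com search pages. As a minimal sketch, such a link can be assembled from the query-string parameters that are visible in the quoted URLs (`dateRange`, `page`, `prefix`, `query`). The function name and its defaults are hypothetical, and the truncated parameters in the quoted links are deliberately not reconstructed here.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://hn.algolia.com/"


def build_search_url(
    query: str,
    date_range: Optional[str] = None,  # e.g. "all" or "pastYear", as seen in the links above
    page: Optional[int] = None,
    prefix: Optional[bool] = None,
) -> str:
    """Build a search-page URL; optional parameters are omitted when None."""
    params = {}
    if date_range is not None:
        params["dateRange"] = date_range
    if page is not None:
        params["page"] = page
    if prefix is not None:
        params["prefix"] = "true" if prefix else "false"
    params["query"] = query
    # urlencode percent-encodes quotes and turns spaces into "+".
    return BASE_URL + "?" + urlencode(params)
```

Calling `build_search_url('"kagi" "ai"')` reproduces the complete link quoted in the "The Man Who Killed Google Search" post.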

What are some alternatives?

When comparing parser and hn-search you can also consider the following projects:

readability - A standalone version of the readability lib

duckduckgo-locales - Translation files for https://duckduckgo.com

Just-Read - A customizable read mode web extension.

v - Simple, fast, safe, compiled language for developing maintainable software. Compiles itself in <1s with zero library dependencies. Supports automatic C => V translation. https://vlang.io

FParsec - A parser combinator library for F#

tidy-html5 - The granddaddy of HTML tools, with support for modern standards

yq - Command-line YAML, XML, TOML processor - jq wrapper for YAML/XML/TOML documents

rdrview - Firefox Reader View as a command line tool

milkdown - 🍼 Plugin driven WYSIWYG markdown editor framework.

arc90-readability - A copy of the original Arc90 repo with links to many of the current ports.

nitter - Alternative Twitter front-end