lol-html vs yq

lol-html	Project	yq
8	Mentions	24
1,390	Stars	2,461
1.9%	Growth	-
5.7	Activity	6.1
about 1 month ago	Latest Commit	10 days ago
Rust	Language	Python
BSD 3-clause "New" or "Revised" License	License	Apache License 2.0

The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

lol-html

Posts with mentions or reviews of lol-html. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-02-23.

Ask HN: A fast, Rust HTML parser that works?
4 projects | news.ycombinator.com | 23 Feb 2023

So I'm doing some web scraping in Rust, and so I will need to parse HTML. [scraper](https://docs.rs/scraper/latest/scraper/) (which uses [html5ever](https://github.com/servo/html5ever)) is doing fine except that it's the bottleneck of my application.
So I need a faster parser. I've tried [tl](https://docs.rs/tl/latest/tl/) which would've been perfect except that it doesn't actually work on the HTML I have. When I try to `query_selector` the elements I need, it returns nothing.
[Kuchiki](https://docs.rs/kuchiki/latest/kuchiki/) is abandoned.
I couldn't figure out how to get [lol-html](https://github.com/cloudflare/lol-html) to work for me (it's designed for re-writing HTML, whatever that means). It doesn't seem to have an API to extract the inner text of an element.
[html5gum](https://github.com/untitaker/html5gum) seems to be just an HTML tokenizer, or otherwise just too low-level. I have not yet tried [quick-xml](https://github.com/tafia/quick-xml/) but judging from the README, it's pretty low-level too. I mean, if these are the only options left then I will try them. Otherwise, I would love to use a parser that's faster but as ergonomic as `scraper` or `tl`.
At this point, I would be happy with an Lxml bridge/port of some sort. I don't need to mutate HTML, just parse and read data from it.
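As a rough illustration of the "just parse and read data" case described above, here is a minimal sketch of streaming inner-text extraction. It uses Python's standard-library `html.parser` rather than lol-html or any Rust crate, so it only shows the general event-driven idea (text arrives in chunks through callbacks, with no DOM). The names `InnerTextExtractor` and `inner_texts` are hypothetical and invented for this example.

```python
from html.parser import HTMLParser


class InnerTextExtractor(HTMLParser):
    """Collect the inner text of every element with a given tag name.

    Hypothetical helper for illustration only; it matches on tag name,
    not on full CSS selectors.
    """

    def __init__(self, tag):
        super().__init__(convert_charrefs=True)
        self.tag = tag
        self.depth = 0      # > 0 while inside a matching element
        self.current = []   # text chunks of the element being read
        self.results = []   # finished inner-text strings

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            if self.depth == 0:
                self.current = []
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == self.tag and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                self.results.append("".join(self.current).strip())

    def handle_data(self, data):
        # Text may arrive in several chunks; buffer it while inside a match.
        if self.depth > 0:
            self.current.append(data)


def inner_texts(html, tag):
    """Return the inner text of each outermost <tag> element in html."""
    parser = InnerTextExtractor(tag)
    parser.feed(html)
    parser.close()
    return parser.results


if __name__ == "__main__":
    print(inner_texts("<ul><li>one <b>two</b></li><li>three</li></ul>", "li"))
```

The trade-off is the same one the post runs into: a streaming parser never builds a tree, so the caller has to buffer text chunks and track nesting itself, which is less ergonomic than a DOM-style `query_selector` call.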
How much Rust work is actually going on at Cloudflare?
2 projects | /r/rust | 15 Jan 2023

I'm also in the Workers org but I have had a bit of interaction with Rust. There's some Rust in the Workers runtime using lol-html for HTMLRewriter as well as some tooling and there's the full blown workers-rs framework that I work on, but that's about it for the Rust I work on regularly.
Is there a library for manipulating HTML?
3 projects | /r/rust | 17 Dec 2022
pup: Parsing HTML at the Command Line
7 projects | news.ycombinator.com | 30 Nov 2022
Texting Robots: Taming robots.txt with Rust and 34 million tests
4 projects | /r/rust | 28 Mar 2022

Thanks again and happy to answer any questions! My current unreleased Rust projects include a web crawler that uses Tokio + Tokio Console + Reqwest with this crate for robots.txt and a fast text extraction library using lol-html that I am planning to sprinkle with some minimal ML to get Readability.js style intelligent extraction (with training in Python). See Fathom for an example of the ML approach I'll likely take.
Like JQ, but for HTML
21 projects | news.ycombinator.com | 7 Sep 2021

I’d like to see a tool using lol-html [0] and their CSS selector API as a streaming HTML editor.
[0] https://github.com/cloudflare/lol-html
Things you can’t do in Rust (and what to do instead)
6 projects | news.ycombinator.com | 15 May 2021
Problems with building a backend app in Rust in 2020
2 projects | /r/rust | 21 Dec 2020

Cloudflare has open sourced lol-html, a "Low output latency streaming HTML parser/rewriter with CSS selector-based API". Is that what you are looking for?

yq

Posts with mentions or reviews of yq. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-11-29.

Jaq – A jq clone focused on correctness, speed, and simplicity
28 projects | news.ycombinator.com | 29 Nov 2023
jq 1.7 Released
33 projects | news.ycombinator.com | 6 Sep 2023
Using XPath in 2023
8 projects | news.ycombinator.com | 16 Jul 2023
How to troubleshoot yaml parsing error "did not find expected key"?
2 projects | /r/kubernetes | 6 May 2023

Install jq and yq, and wrap your commands with | yq -y ..
Memes are all cool and all. But this is your daily remaining that 10000! =
4 projects | /r/mathmemes | 23 Apr 2023

Confusingly, there is another project called yq that does exactly what you're suggesting: it's a preprocessor that converts YAML to JSON and then uses jq. https://github.com/kislyuk/yq
inhumane and error-prone
3 projects | /r/kubernetes | 21 Apr 2023

yq
Yq is a portable yq: command-line YAML, JSON, XML, CSV and properties processor
11 projects | news.ycombinator.com | 4 Feb 2023

I personally find the yq tool from https://github.com/kislyuk/yq much more useful: it has all the same options and formats as `jq` (as it's really a wrapper around jq), unlike the `yq` in the OP here, where only partial functionality exists.
The YAML Document from Hell
19 projects | news.ycombinator.com | 12 Jan 2023
Scraping weather info
1 project | /r/bash | 24 Nov 2022

XML data from the API can be parsed and filtered with xq. There may be multiple ways to get it; first try the yq toolset which includes it.
Show HN: Xq – command-line XML and HTML beautifier and content extractor
7 projects | news.ycombinator.com | 12 Nov 2022

There is also yq [1], which attempts the same for yaml, toml and xml. (And confusingly also contains a binary named "xq" for querying xml, however with a different syntax)
[1] https://github.com/kislyuk/yq

What are some alternatives?

When comparing lol-html and yq you can also consider the following projects:

actor-rust-scraper - Experimental scraper in Rust suited for running locally or on the Apify platform. Inspired by Apify SDK.

jq - Command-line JSON processor [Moved to: https://github.com/jqlang/jq]

tq - Perform a lookup by CSS selector on an HTML input

yq - yq is a portable command-line YAML, JSON, XML, CSV, TOML and properties processor

tools - all-in collection of productivity scripts, CLI tools, utility libraries, fuse filesystems, and also some stuff

jq - Command-line JSON processor

hq - lightweight command line HTML processor using CSS and XPath selectors

dasel - Select, put and delete data from JSON, TOML, YAML, XML and CSV files with a single tool. Supports conversion between formats and can be used as a Go package.

cargo-expand - Subcommand to show result of macro expansion

xmlq - filter xml in the command line with xpath

xidel - Command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern matching. It can also create new or transformed XML/HTML/JSON documents.

hn-search - Hacker News Search
