-
xidel
Command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern matching. It can also create new or transformed XML/HTML/JSON documents.
-
tools
all-in collection of productivity scripts, CLI tools, utility libraries, fuse filesystems, and also some stuff (by bAndie91)
> Well, jq is grep as well as sed and awk, but yeah, htmlq seems to be just grep, for sake of comparison.
Exactly, and that is what I mean. If you want to compare, compare it with grep, not jq.
Someone else posted xidel[0] in this thread, which I've not used, but it seems to be the "jq but for html".
[0] https://github.com/benibela/xidel
Once upon a time I was using pup[0] for this sort of thing; later I switched to cascadia[1], which seemed much more advanced.
Comparing the two repos, it seems pup's development has somewhat died down.
These tools, including htmlq, seem to sell themselves as "jq for HTML", which is far from the truth. jq is closer to awk, in that you can do just about everything with it. Cascadia, htmlq, and pup are closer to grep for HTML: they can essentially only select data from an HTML source.
[0] https://github.com/EricChiang/pup
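To make the grep-versus-awk distinction concrete, here is a minimal Python sketch using only the standard library. It is not how pup, cascadia, htmlq, or jq are implemented; the `LinkCollector` class, the `select_links` helper, and the sample markup are all made up for illustration. The first step only selects (the "grep" part); the second filters, reshapes, and aggregates the selection (the part a jq-like tool adds).

```python
import json
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical helper: collect href and text for every <a href=...>."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> we are currently inside, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(
                {"href": self._href, "text": "".join(self._text).strip()}
            )
            self._href = None


def select_links(html):
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Made-up sample input.
SAMPLE = (
    '<ul><li><a href="/learn">Learn</a></li>'
    '<li><a href="https://play.rust-lang.org/">Playground</a></li>'
    '<li><a href="/tools">Tools</a></li></ul>'
)

# "grep for HTML": selection only.
selected = select_links(SAMPLE)
hrefs = [link["href"] for link in selected]

# "awk/jq"-style step: filter, reshape, and aggregate what was selected.
summary = {
    "internal": sorted(l["text"] for l in selected if l["href"].startswith("/")),
    "external_count": sum(1 for l in selected if not l["href"].startswith("/")),
}

print(hrefs)
print(json.dumps(summary))
```

A selector-only tool stops after the first `print`; everything after that would have to be done in the shell or in another program.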
This is very nice!
For reasoning about tree-based data such as HTML, I also highly recommend the declarative programming language Prolog. For instance, here is the sample query from the README, fetching all elements with id get-help from https://www.rust-lang.org, using Scryer Prolog and its SGML and HTTP libraries in combination with the XPath-inspired query language from library(xpath):
?- http_open("https://www.rust-lang.org", Stream, []),
I did write it a few years ago.
https://github.com/plainas/tq
['/', '/tools/install', '/learn', 'https://play.rust-lang.org/', '/tools', '/governance', '/community', 'https://blog.rust-lang.org/',...
Is anyone else using JsonPath (https://github.com/json-path/JsonPath) over the jq route?
I hope we standardize on some jq query language, like we have with a base set of SQL syntax.
https://jsoup.org/ has been around for a long time and seems a bit more mature & maintained than this two-code-files 2-year-old repo. Highly recommend.
I’d like to see a tool using lol-html [0] and their CSS selector API as a streaming HTML editor.
[0] https://github.com/cloudflare/lol-html
xmlstarlet is really nothing like jq, as a language. But yes, I use it because it is the best command-line XML processor I've found. That's the only similarity to jq.
Is this the yq? https://kislyuk.github.io/yq/ It does contain an 'xq', as a literal wrapper for jq, piping output into it after transcoding XML to JSON using xmltodict https://github.com/martinblech/xmltodict (which explodes xml into separate JSON data structures).
This is a bash one-liner! But TBF it really is a 'jq for xml'. I think it would be horrible for some things, but you could also do a lot of useful things painlessly.
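For anyone wondering what "transcode XML to JSON, then query it" looks like, here is a rough stdlib-only Python sketch of the idea. It is not the code of yq/xq or xmltodict. The `element_to_dict` function is a hypothetical helper that loosely mimics the conventions I believe xmltodict uses ("@" prefix for attributes, "#text" for text, repeated children collected into a list), and the sample document is made up.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(elem):
    """Hypothetical converter: turn an Element into plain dicts/lists/strings.

    Simplification: tail text (text after a child element) is ignored.
    """
    node = {"@" + name: value for name, value in elem.attrib.items()}
    for child in elem:
        value = element_to_dict(child)
        if child.tag in node:
            # Second or later occurrence of this tag: collect into a list.
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    text = (elem.text or "").strip()
    if not node:
        return text or None      # plain leaf: just its text
    if text:
        node["#text"] = text
    return node


# Made-up sample input.
XML = """<feed>
  <entry id="1"><title>jq</title></entry>
  <entry id="2"><title>htmlq</title></entry>
</feed>"""

root = ET.fromstring(XML)
doc = {root.tag: element_to_dict(root)}

# Once the data is JSON-shaped, querying is ordinary dict/list access,
# roughly what the jq filter `.feed.entry[].title` would do.
titles = [entry["title"] for entry in doc["feed"]["entry"]]

print(json.dumps(doc))
print(titles)
```

This also shows where the approach can get awkward: a tag that appears once becomes a dict and a tag that appears twice becomes a list, so queries have to cope with both shapes.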
> Software definition through a reference to another software is somewhat confusing.
Possibly, depending on background as you note, but not all promotion is aimed at the same audience. When submitting to HN, "like jq, but for X" is short and conveys what it is to most of the people who would care, I think. jq has been submitted and talked about here many times with lively discussion over the years.[1] At this point I think most of those who are interested in what that is and what this is will understand fairly quickly from the title. Those who don't might be missed, or they might look it up like you, or they might see it through some other submission some other time with a different title which isn't based on a chain of references.
1: https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...
parsel[0] is a Python script in front of the identically named Python lib; it extracts parts of the HTML by CSS selector. Its advantage over most similar tools is that you can navigate up and down the DOM tree to find precisely what you want when the HTML is poorly marked up or the parts you are looking for are not close to each other.
[0] https://github.com/bAndie91/tools/blob/master/usr/bin/parsel
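To illustrate what "navigate up and down" buys you, here is a small Python sketch. It does not use the parsel script or library and says nothing about their actual API; the standard library's ElementTree stands in, which means the input has to be well-formed. The `value_for` helper and the sample table are made up. The markup has no ids or classes, so the only way to find a value is to locate its label, go up to the row, and come back down to the neighbouring cell.

```python
import xml.etree.ElementTree as ET

# Made-up, poorly marked-up input: no ids or classes to select on.
HTML = """<table>
  <tr><td>Name</td><td>widget</td></tr>
  <tr><td>Price</td><td>42</td></tr>
</table>"""

root = ET.fromstring(HTML)

# ElementTree has no parent pointers, so build a child -> parent map once.
parent_of = {child: parent for parent in root.iter() for child in parent}


def value_for(label):
    """Hypothetical helper: return the text of the cell right after `label`."""
    for cell in root.iter("td"):
        if (cell.text or "").strip() == label:
            row = parent_of[cell]          # step up to the enclosing row
            cells = list(row)              # step back down to its cells
            position = cells.index(cell)
            if position + 1 < len(cells):
                return (cells[position + 1].text or "").strip()
    return None


print(value_for("Price"))
print(value_for("Missing"))
```

A tool that can only match a selector and print the result cannot express the "up to the row, then across" step without extra post-processing.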