-
xidel
Command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern matching. It can also create new or transformed XML/HTML/JSON documents.
> Well, jq is grep as well as sed and awk, but yeah, htmlq seems to be just grep, for sake of comparison.
Exactly, and that is what I mean. If you want to compare, compare it with grep, not jq.
Someone else posted xidel[0] in this thread, which I've not used, but it seems to be the "jq but for html".
-
I once used pup[0] for this kind of thing, and later switched to cascadia[1], which seemed much more advanced.
Comparing the two repos, it seems pup's development has somewhat died down.
These tools, including htmlq, seem to sell themselves as "jq for HTML", which is far from the truth. jq is closer to awk, where you can do just about everything. Cascadia, htmlq, and pup seem closer to grep for HTML: they can essentially only select data from an HTML source.
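To make the grep-versus-awk distinction concrete, here is a minimal Python sketch using only the standard library. It is not how any of the tools named above work internally; `select_links` and `transform_links` are hypothetical names chosen for illustration. The first function only selects (the "grep" role), the second reshapes the selection into a new structure (the kind of step jq or awk adds).

```python
import json
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs for every <a> element that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def select_links(html):
    """Selection only: return the matching data, unchanged."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


def transform_links(links):
    """Transformation: regroup the selected links into a new document."""
    shaped = {"internal": [], "external": []}
    for href, text in links:
        key = "external" if href.startswith(("http://", "https://")) else "internal"
        shaped[key].append({"href": href, "text": text})
    return shaped


if __name__ == "__main__":
    sample = '<p><a href="/learn">Learn</a> <a href="https://blog.rust-lang.org/">Blog</a></p>'
    print(json.dumps(transform_links(select_links(sample)), indent=2))
```

A selector-only tool stops after the first function; anything resembling the second has to happen in another program further down the pipe.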
-
This is very nice!
For reasoning about tree-based data such as HTML, I also highly recommend the declarative programming language Prolog. For instance, here is the sample query from the README, fetching all elements with id get-help from https://www.rust-lang.org, using Scryer Prolog and its SGML and HTTP libraries in combination with the XPath-inspired query language from library(xpath):
?- http_open("https://www.rust-lang.org", Stream, []),
-
I did write it a few years ago.
-
['/', '/tools/install', '/learn', 'https://play.rust-lang.org/', '/tools', '/governance', '/community', 'https://blog.rust-lang.org/',...
-
Is anyone else using https://github.com/json-path/JsonPath instead of going the jq route?
I hope we standardize on some jq-like query language, the way we have with a base set of SQL syntax.
-
https://jsoup.org/ has been around for a long time and seems a bit more mature & maintained than this two-code-files 2-year-old repo. Highly recommend.
-
I’d like to see a tool using lol-html [0] and their CSS selector API as a streaming HTML editor.
-
xmlstarlet is really nothing like jq, as a language. But yes, I use it because it is the best command-line XML processor I've found. That's the only similarity to jq.
Is this the yq? https://kislyuk.github.io/yq/ It does contain an 'xq', as a literal wrapper for jq, piping output into it after transcoding XML to JSON using xmltodict https://github.com/martinblech/xmltodict (which explodes XML into separate JSON data structures).
This is a bash one-liner! But TBF it really is a 'jq for XML'. I think it would be horrible for some things, but you could also do a lot of useful things painlessly.
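For anyone wondering what "transcoding XML to JSON" amounts to, here is a rough standard-library sketch of the idea. It is not xmltodict and not xq's actual code: `element_to_data` and `xml_to_json` are hypothetical helpers, and only the `@` attribute prefix and `#text` key are borrowed from xmltodict's default conventions. Edge cases such as namespaces, mixed content and empty elements are handled naively.

```python
import json
import xml.etree.ElementTree as ET


def element_to_data(elem):
    """Convert an element into plain dicts, lists and strings (illustrative only)."""
    node = {"@" + name: value for name, value in elem.attrib.items()}
    for child in elem:
        value = element_to_data(child)
        if child.tag in node:
            # A repeated tag becomes a JSON array.
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    text = (elem.text or "").strip()
    if not node:
        return text          # leaf element: just its text
    if text:
        node["#text"] = text
    return node


def xml_to_json(xml_text):
    """Parse XML and return a JSON string that a JSON query tool could consume."""
    root = ET.fromstring(xml_text)
    return json.dumps({root.tag: element_to_data(root)})


if __name__ == "__main__":
    doc = '<feed><entry id="1"><title>a</title></entry><entry id="2"><title>b</title></entry></feed>'
    data = json.loads(xml_to_json(doc))
    # Roughly what a jq filter like `.feed.entry[].title` would pull out:
    print([entry["title"] for entry in data["feed"]["entry"]])
```

Once the XML is in this shape, every jq feature applies unchanged, which is why the wrapper can be so small. The cost is that document order between differently named siblings is lost.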
-
> Software definition through a reference to another software is somewhat confusing.
Possibly, depending on background as you note, but not all promotion is aimed at the same audience. When submitting to HN, "like jq, but for X" is short and conveys what it is to most of the people who would care, I think. jq has been submitted and talked about here many times with lively discussion over the years.[1] At this point I think most of those who are interested in what that is and what this is will understand fairly quickly from the title. Those who don't might be missed, or they might look it up like you, or they might see it through some other submission some other time with a different title which isn't based on a chain of references.
1: https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...
-
parsel[0] is a Python script in front of the identically named Python library; it extracts parts of the HTML by CSS selector. Its advantage over most similar tools is that you can navigate up and down the DOM tree to find precisely what you want when the HTML is poorly marked up, or when the parts you are searching for are not close to each other.
[0] https://github.com/bAndie91/tools/blob/master/usr/bin/parsel
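As a rough illustration of why moving up and down the tree helps with poorly marked-up pages, here is a standard-library Python sketch. It does not use the parsel script or library; `value_for_label` is a hypothetical helper, and the input is assumed to be well-formed markup so that `xml.etree.ElementTree` can parse it. The value cell has no id or class, so it is located by finding a label cell, stepping up to its row, and stepping back down to the next sibling.

```python
import xml.etree.ElementTree as ET

# Made-up sample: the value cells carry no class or id to select on.
DOC = """<table>
  <tr><td>Name</td><td>example</td></tr>
  <tr><td>Status</td><td>active</td></tr>
</table>"""


def value_for_label(root, label):
    """Return the text of the cell that follows the cell whose text is `label`."""
    # ElementTree has no parent pointers, so build a child -> parent map once.
    parents = {child: parent for parent in root.iter() for child in parent}
    for cell in root.iter("td"):
        if (cell.text or "").strip() == label:
            row = parents[cell]        # step up to the enclosing <tr>
            cells = list(row)          # step down to all cells of that row
            position = cells.index(cell)
            if position + 1 < len(cells):
                return cells[position + 1].text
    return None


if __name__ == "__main__":
    print(value_for_label(ET.fromstring(DOC), "Status"))
```

A selector that can only descend would need a distinguishing attribute on the value cell; going up and back down needs only the visible label.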