xpe
pup
 | xpe | pup |
---|---|---|
Mentions | 8 | 52 |
Stars | 26 | 7,998 |
Growth | - | - |
Activity | 0.0 | 0.0 |
Last commit | over 1 year ago | about 1 month ago |
Language | Python | HTML |
License | BSD 3-clause "New" or "Revised" License | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
xpe
- pup: Parsing HTML at the Command Line
- xpe: A commandline xpath parser that is easy to use.
- What are some useful cli tools that arent popular?
xpe - a commandline XPath parser. XPaths are better than CSS queries for getting at specific HTML elements in the DOM. Compared to other parsers, this one is easier to use, and supports HTML.
- Tell ONE terminal app you use everyday but no one seems know about the app
I use the heck out of xpe. It's a super simple command-line XPath parser using lxml in Python.
- htmlq - like jq, but for HTML
If you like xmllint, you might like xpe. It's more user-friendly.
- What tools / utilities have you written that you use regularly?
xpe - a commandline XPath parser. I made this after trying to use XPaths for web automation in bash, and not finding anything that worked.
- A list of command line tools for manipulating structured text data
For XPath parsing in simple commandline web automation, xpe is pretty handy. It's a really simple Python script, but it scratches that itch, and it's only a pip install away.
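The mentions above describe pulling specific elements out of markup with an XPath expression. The idea can be sketched in Python using only the standard library's `xml.etree.ElementTree`. This is not xpe's implementation: xpe is described as using lxml, whereas ElementTree supports only a limited XPath subset and needs well-formed markup. The sample document and the `select_attr` helper are hypothetical names introduced for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup. It must be well-formed, because
# ElementTree is an XML parser rather than a lenient HTML parser.
SAMPLE = """<html><body>
<div id="nav"><a href="/home">Home</a><a href="/docs">Docs</a></div>
<div id="main"><a href="/download">Download</a></div>
</body></html>"""


def select_attr(markup, path, attr):
    """Return `attr` from every element matching a limited XPath expression."""
    root = ET.fromstring(markup)
    # Skip matched elements that do not carry the requested attribute.
    return [el.get(attr) for el in root.findall(path) if el.get(attr) is not None]


if __name__ == "__main__":
    # Print the link targets found inside the div whose id is "nav".
    for href in select_attr(SAMPLE, ".//div[@id='nav']/a", "href"):
        print(href)
```

For real-world HTML that is not well-formed, a lenient parser such as lxml (the library xpe is said to build on) would likely be the better fit.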
pup
- script to download some notes
Change lnk=$(curl -s https://www.selfstudys.com$url | grep "PDFFlip" | cut -d '"' -f 6) to lnk=$(curl -s https://www.selfstudys.com$url | pup "div#PDFF attr{source}"). Here pup prints the content of the source attribute from the div tag with id PDFF. I don't know that much about HTML and CSS, so this is what I came up with, but I am sure you can also select by class and make a list of sub-URLs from the results. Check out the video from bugswriter on pup, or read the docs on GitHub for more info: https://github.com/ericchiang/pup
- What monitoring tool do you use or recommend?
jq is pretty amazing. If you are comfortable with its jquery-like CSS selector syntax, then I should also mention a couple similar cli utilities that apply it to HTML: htmlp and pup.
- Creating a data scraper as a beginner?
Regex is not a great tool for parsing web pages. Open up a browser dev tools window and select a bit of the page. Right click > copy... XPath expression or CSS selector. A proper web scraping tool will accept either of those. No muss, no fuss. You can even use simple command line tools: xpath or pup
- December 5, 2022: FLiP Stack Weekly
- Show HN: A tool like jq, but for parsing HTML
This is HTML to JSON, written in Rust. There's also pup[1], which I found out about just the other day on HN[2]; it uses a very similar syntax (CSS selectors) but outputs HTML and is written in Go.
I can see room for both, though it would be interesting to have a more detailed comparison to go on (e.g. types of HTML, speed etc).
[1] https://github.com/ericchiang/pup
[2] https://news.ycombinator.com/item?id=33805732
- Pup: Parsing HTML at the command line
- pup: Parsing HTML at the Command Line
It looks like the project became inactive for a bit and there are alternatives such as htmlq, etc. https://github.com/ericchiang/pup/issues/150
- Converting field before delimiter to uppercase and how to replace with multiple newlines
Another tool worth mentioning is pup. It can produce JSON output, which means you can pipe it to jq.
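Two ideas recur in the pup mentions above: selecting an element by tag and id and printing one of its attributes (as in div#PDFF attr{source}), and emitting JSON so the result can be piped to jq. Both can be sketched in Python with the standard library's `html.parser`. This is not how pup is implemented, and it handles only a single tag-plus-id match rather than full CSS selectors. The sample HTML, the `/files/notes.pdf` path, and the names `AttrCollector` and `select_by_id` are hypothetical.

```python
import json
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Collect the attributes of start tags matching a tag name and id."""

    def __init__(self, tag, element_id):
        super().__init__()
        self.tag = tag
        self.element_id = element_id
        self.matches = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs.
        attr_map = dict(attrs)
        if tag == self.tag and attr_map.get("id") == self.element_id:
            self.matches.append(attr_map)


def select_by_id(html, tag, element_id):
    """Return a list of attribute dicts for elements matching tag#id."""
    parser = AttrCollector(tag, element_id)
    parser.feed(html)
    parser.close()
    return parser.matches


# Hypothetical page fragment standing in for a downloaded document.
SAMPLE = (
    '<html><body><div id="PDFF" source="/files/notes.pdf"></div>'
    '<div id="other"></div></body></html>'
)

if __name__ == "__main__":
    matches = select_by_id(SAMPLE, "div", "PDFF")
    # Plain-text output: one attribute value per match.
    for match in matches:
        print(match.get("source"))
    # JSON output, suitable for piping into a JSON processor.
    print(json.dumps(matches))
```

Compared with the grep and cut pipeline quoted earlier, matching on the parsed tag and id does not depend on the position of quote characters within a line of markup.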
What are some alternatives?
ProtonUpdater - Script to make it easier to update Proton GE to the latest version
htmlq - Like jq, but for HTML.
escaperoom - Command line utility to generate/host a fully functioning virtual escape room from a JSON config.
xidel - Command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern matching. It can also create new or transformed XML/HTML/JSON documents.
focus - A fully featured productivity timer for the command line, based on the Pomodoro Technique. Supports Linux, Windows, and macOS.
gron - Make JSON greppable!
pandoc - Universal markup converter
yq - Command-line YAML, XML, TOML processor - jq wrapper for YAML/XML/TOML documents
lol-html - Low output latency streaming HTML parser/rewriter with CSS selector-based API
cascadia - Go cascadia package command line CSS selector
xdotool - fake keyboard/mouse input, window management, and more
ddgr - DuckDuckGo from the terminal