| | browserless | pup |
|---|---|---|
| Mentions | 21 | 52 |
| Stars | 7,920 | 8,000 |
| Growth | 8.4% | - |
| Activity | 9.8 | 0.0 |
| Last commit | 6 days ago | 6 days ago |
| Language | TypeScript | HTML |
| License | GNU General Public License v3.0 or later | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
browserless
- How and why we ripped our Open Source product apart for a full rebuild
  The core product is managed, cloud-hosted browsers. We run thousands at a time on AWS and DigitalOcean for people to use with Puppeteer and Playwright scripts. Our container is also available to self-deploy under an open-source license.
- Self-hosted browserless.io alternative?
  You should search for "Puppeteer as a service"; there are some projects on GitHub that you could deploy, such as https://github.com/browserless/chrome
- Remote Server Compromised
  So I recently installed changedetection.io on my server. It requires either selenium/standalone-chrome-debug:3.141.59 or browserless/chrome. I installed it with Selenium in a Docker container, since I noticed that it ran better than the browserless/chrome service.
- Angular docker base image
  I had a look at this one: https://github.com/browserless/chrome ... but it is not suitable for builds, e.g. set to production mode, user permissions and so on.
- browserless chrome (Web browser automation built for everyone)
- Ask HN: What are the best tools for web scraping in 2022?
- Using changedetection.io (installed via pip, not docker). How do I set up "WebDriver Chrome/Javascript"
  git clone https://github.com/browserless/chrome /opt/browserless
- How to automate PDF generation of dashboards/web pages with open-source web automation
- Starring your repo does not give you permission to spam me
pup
- script to download some notes
  And lnk=$(curl -s https://www.selfstudys.com$url |grep "PDFFlip" | cut -d '"' -f 6) becomes lnk=$(curl -s https://www.selfstudys.com$url | pup "div#PDFF attr{source}" ). Here pup prints the content of the source attribute of the div tag with id PDFF. I don't know that much about HTML and CSS, so this is what I came up with, but I am sure you can also select by class and make a list of sub-URLs from them. Check out the video from bugswriter on pup, or read the docs on GitHub for more info: https://github.com/ericchiang/pup
- What monitoring tool do you use or recommend?
  jq is pretty amazing. If you are comfortable with its jQuery-like CSS selector syntax, then I should also mention a couple of similar CLI utilities that apply it to HTML: htmlp and pup.
- Creating a data scraper as a beginner?
  Regex is not a great tool for parsing web pages. Open up a browser dev tools window and select a bit of the page. Right click > copy... XPath expression or CSS selector. A proper web scraping tool will accept either of those. No muss, no fuss. You can even use simple command line tools: xpath or pup
- December 5, 2022: FLiP Stack Weekly
- Show HN: A tool like jq, but for parsing HTML
  This is HTML to JSON, written in Rust, and there's also pup[1], which I found out about just the other day on HN[2]. It uses a very similar syntax (CSS selectors) but outputs HTML and is written in Go.
  I can see room for both, though it would be interesting to have a more detailed comparison to go on (e.g. types of HTML, speed etc).
  [1] https://github.com/ericchiang/pup
  [2] https://news.ycombinator.com/item?id=33805732
- Pup: Parsing HTML at the command line
- pup: Parsing HTML at the Command Line
  It looks like the project became inactive for a bit, and there are alternatives such as htmlq, etc. https://github.com/ericchiang/pup/issues/150
- Converting field before delimiter to uppercase and how to replace with multiple newlines
  Another tool worth mentioning is pup - it can produce JSON output, which means you can pipe it to jq
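Several of the posts above describe the same workflow: pick an element with a selector such as div#PDFF, print one of its attributes, or emit JSON for further filtering. The following is a minimal sketch of that idea using only the Python standard library. It is a stand-in for illustration, not pup's implementation; the sample HTML, the AttrCollector class, and the select_attrs helper are all hypothetical names, and it only handles the narrow "tag plus id" case rather than full CSS selectors.

```python
import json
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Collect the attributes of every start tag matching a tag name and id."""

    def __init__(self, tag, element_id):
        super().__init__()
        self.tag = tag
        self.element_id = element_id
        self.matches = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs
        attr_map = dict(attrs)
        if tag == self.tag and attr_map.get("id") == self.element_id:
            self.matches.append(attr_map)


def select_attrs(html, tag, element_id):
    """Return the attribute dicts of all <tag id="element_id"> elements."""
    parser = AttrCollector(tag, element_id)
    parser.feed(html)
    parser.close()
    return parser.matches


# Hypothetical sample page, invented for this example.
SAMPLE_HTML = """
<html><body>
  <div id="PDFF" source="/files/notes.pdf"></div>
  <div id="other" source="/files/ignored.pdf"></div>
</body></html>
"""

matches = select_attrs(SAMPLE_HTML, "div", "PDFF")

# Roughly what the quoted selector 'div#PDFF attr{source}' asks for.
sources = [m["source"] for m in matches if "source" in m]
print(sources[0])

# Emitting JSON makes the result easy to hand to a JSON tool such as jq.
as_json = json.dumps(matches)
print(as_json)
```

For real pages, a dedicated tool such as pup or a full selector library is the better choice; the sketch only shows why selecting by structure is sturdier than cutting fields out of grep output.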
What are some alternatives?
Dompdf - HTML to PDF converter for PHP
htmlq - Like jq, but for HTML.
PHP-Proxy - Proxy Application built on php-proxy library ready to be installed on your server
xidel - Command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern matching. It can also create new or transformed XML/HTML/JSON documents.
Twitch-Drops-Bot - A Node.js bot that will automatically watch Twitch streams and claim drop rewards.
gron - Make JSON greppable!
browsershot - Convert HTML to an image, PDF or string
yq - Command-line YAML, XML, TOML processor - jq wrapper for YAML/XML/TOML documents
selenoid - Selenium Hub successor running browsers within containers. Scalable, immutable, self hosted Selenium-Grid on any platform with single binary.
cascadia - Go cascadia package command line CSS selector
FPDI - FPDI is a collection of PHP classes facilitating developers to read pages from existing PDF documents and use them as templates in FPDF.
ddgr - DuckDuckGo from the terminal