puppeteer-cluster
puppeteer
Metric | puppeteer-cluster | puppeteer
---|---|---
Mentions | 3 | 203
Stars | 2,459 | 78,589
Growth | - | 1.3%
Activity | 7.1 | 9.8
Last commit | 24 days ago | 3 days ago
Language | TypeScript | TypeScript
License | MIT License | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
puppeteer-cluster
-
Building Unlighthouse: Open-Source Package For Site-wide Google Lighthouse scans
To make Unlighthouse fast, we combined this lighthouse binary with the package puppeteer-cluster, which allows for multi-threaded lighthouse scans.
-
Redis with puppeteer for web scraping
https://github.com/thomasdondorf/puppeteer-cluster implements a queue-like mechanism, but it does not use Redis.
-
How to reuse puppeteer browser?
I've used this library: Puppeteer cluster
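The mentions above describe puppeteer-cluster as a queue-like mechanism that lets several jobs run at once. As a rough illustration of that general idea only, here is a minimal in-memory queue with a concurrency cap in plain JavaScript. It is not puppeteer-cluster's code or API; `runQueue`, `worker`, and `limit` are hypothetical names chosen for this sketch.

```javascript
// Hypothetical sketch: run `worker` over `jobs` with at most `limit` in flight.
// This is NOT puppeteer-cluster's implementation, only the general pattern.
async function runQueue(jobs, worker, limit) {
  const results = new Array(jobs.length);
  let next = 0; // index of the next job to hand out

  // Each runner keeps taking the next job until the queue is empty.
  async function runner() {
    while (next < jobs.length) {
      const index = next++;
      results[index] = await worker(jobs[index]);
    }
  }

  // Start up to `limit` runners and wait for all of them to drain the queue.
  const runners = Array.from({ length: Math.min(limit, jobs.length) }, runner);
  await Promise.all(runners);
  return results; // results keep the same order as `jobs`
}
```

In a scraper, `worker` would presumably open a page for each URL and return the extracted data; here it can be any async function, which keeps the sketch independent of a browser.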
puppeteer
-
AMA: I'm the founder of a 5-year-old tech startup, and we just recently reached $1M ARR
Literally just went to a GitHub project (puppeteer in our case) and sorted issues by most comments/reactions. The number one was "How to run on Ubuntu" or something similar. Actually, it's this issue right here: https://github.com/puppeteer/puppeteer/issues/290.
-
How to workaround RAM-leaking libraries like Puppeteer
Motivated by this issue https://github.com/puppeteer/puppeteer/issues/5893
-
How to fix RAM-leaking libraries like Puppeteer easily. Universal way to fix RAM leaks once and forever
It is a well-known and very old issue. I took the link from the GitHub issue page: https://github.com/puppeteer/puppeteer/issues/5893
-
How to Scrape eBay Organic Results with Node.js
First, we need to create a Node.js project and add the npm packages puppeteer, puppeteer-extra and puppeteer-extra-plugin-stealth to control Chromium (or Chrome, or Firefox, but for now we work only with Chromium, which is used by default) over the DevTools Protocol in headless or non-headless mode.
-
Show HN: Browser extension that spoofs your location data to match your VPN
You can load extensions in puppeteer, also in headless with the experimental chrome mode. https://github.com/puppeteer/puppeteer/blob/main/docs/api.md...
-
Is there a way to scrape a website's inventory pages to have them populate on my site?
It's really not that hard. I've used Puppeteer (https://www.npmjs.com/package/puppeteer) in the past for similar projects. If you don't have any programming background, search for third party scraping tools with a built-in rotating proxy service - there are a few about, just give it a Google.
-
How We Automated our End-to-End Testing from the First Line of Code
After some research and understanding of what we need to design and build, we chose Docker Compose, RPyC, and Puppeteer for our test automation suite. The tech stack for our E2E testing is completely independent of the stack we chose for our product of course (we’ll dive into this in a separate post), and it can run with any product that is able to run Docker and Docker Compose in their environment.
-
How can I get browser automation solely in backend?
I'd recommend using Puppeteer, which can drive Chrome in headful or headless modes. In dev mode, you can spin up a headful browser so that you can watch what your script is doing and debug it. In production, you'll just flip the condition to headless.
-
Dynamic Open Graph Image Generator with Layer0, Next.js, TailwindCSS, Chrome AWS Lambda and Puppeteer-Core

    // File: pages/api/index.js
    // This is accessible from the deployed link (say, Y.com) as y.com/api?queryparametershere
    import core from 'puppeteer-core'
    import chromium from 'chrome-aws-lambda'

    export default async function handler(req, res) {
      // Only allow GET to the given route
      if (req.method === 'GET') {
        const { title, mode, image, width = 1400, height = 720 } = req.query
        // Launching chrome with puppeteer-core
        // https://github.com/puppeteer/puppeteer/issues/3543#issuecomment-438835878
        const browser = await core.launch({
          args: chromium.args,
          defaultViewport: chromium.defaultViewport,
          executablePath: await chromium.executablePath,
          headless: chromium.headless,
          ignoreHTTPSErrors: true,
        })
        // Create a page
        const page = await browser.newPage()
        // Define the dimensions of the page (query values arrive as strings)
        await page.setViewport({ width: parseInt(width, 10), height: parseInt(height, 10) })
        // Load /dynamic_blogs with the given query parameters
        // Don't forget to encode them!
        // req.headers.host gives the deployed host as is, so this app can be deployed anywhere
        // This allows us to take advantage of Layer0 caching to serve the /dynamic_blogs pages faster to this .goto() call
        await page.goto(`https://${req.headers.host}/dynamic_blogs?title=${encodeURIComponent(title)}&image=${encodeURIComponent(image)}&mode=${encodeURIComponent(mode)}`)
        // Ideally, use an image that is fast to load.
        // Fall back to a 5 second wait in case the image takes longer to load.
        await page.waitForTimeout(5000)
        // Take a screenshot of the body of the page, that is, the content
        const content = await page.$('body')
        const imageBuffer = await content.screenshot({ omitBackground: true })
        await page.close()
        await browser.close()
        res.setHeader('Cache-Control', 'public, immutable, no-transform, s-maxage=31536000, max-age=31536000')
        res.setHeader('Content-Type', 'image/png')
        // Set the status before sending the body
        res.status(200).send(imageBuffer)
        return
      }
      // Any method other than GET results in an error 400.
      res.status(400).json({ message: 'Invalid method.' })
      return
    }
-
Scraping Amazon using Puppeteer and Browserless
Here's the kicker: all we are going to use is one script and under 50 lines of code. How are we going to do this, you may ask? With the magic of JavaScript, plus a Node library called Puppeteer and a web service called Browserless. If you don't know much about these tools, I'd highly recommend you check them out further. They offer a lot of neat capabilities! We will go over some of them in this article.
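One of the mentions above suggests running a headful browser in development and flipping a condition to headless in production. A minimal sketch of that toggle could look like the following. The helper name `launchOptionsFor` and the use of a `NODE_ENV`-style string are assumptions for illustration; only the `headless` launch option comes from Puppeteer itself.

```javascript
// Hypothetical helper: pick Puppeteer launch options from an environment name.
// Headless only in production, so a visible browser opens everywhere else.
function launchOptionsFor(nodeEnv) {
  return { headless: nodeEnv === 'production' };
}

// Assumed usage with Puppeteer (not executed in this sketch):
//   const puppeteer = require('puppeteer');
//   const browser = await puppeteer.launch(launchOptionsFor(process.env.NODE_ENV));
```

Note that with this choice an unset environment name yields a headful browser, which is convenient on a developer machine but will likely fail on a server without a display, so production deployments would need to set the variable explicitly.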
What are some alternatives?
axios - Promise based HTTP client for the browser and node.js
Nightmare - A high-level browser automation library.
Playwright - Playwright is a framework for Web Testing and Automation. It allows testing Chromium, Firefox and WebKit with a single API.
cheerio - Fast, flexible, and lean implementation of core jQuery designed specifically for the server.
WKHTMLToPDF - Convert HTML to PDF using Webkit (QtWebKit)
karma - Spectacular Test Runner for JavaScript
phantomjs - Scriptable Headless Browser
pyppeteer - Headless chrome/chromium automation library (unofficial port of puppeteer)
Cypress - Fast, easy and reliable testing for anything that runs in a browser.
request - 🏊🏾 Simplified HTTP request client.
rendertron - A Headless Chrome rendering solution
puppeteer-extra - 💯 Teach puppeteer new tricks through plugins.