Top 23 Puppeteer Open-Source Projects
- crawlee: A web scraping and browser automation library for Node.js for building reliable crawlers in JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP, in both headful and headless mode, with proxy rotation.
- browserless: Deploy headless browsers in Docker. Run on our cloud or bring your own. Free for non-commercial uses.
- url-to-pdf-api: Web page PDF/PNG rendering done right. Self-hosted service for rendering receipts, invoices, or any content.
- gotenberg: A developer-friendly API for converting numerous document formats into PDF files, and more!
- venom: A high-performance system developed with JavaScript for building WhatsApp bots. It supports any kind of interaction, such as customer service, media sending, AI-based sentence recognition, and all types of design architecture for WhatsApp.
- browser-fingerprinting: Analysis of bot protection systems with available countermeasures 🚿. How to defeat anti-bot systems 👻 and get around browser fingerprinting scripts 🕵️ when scraping the web?
- unlighthouse: Scan your entire site with Google Lighthouse in 2 minutes (on average). Open source, fully configurable with minimal setup.
- pwa-asset-generator: Automates PWA asset generation and image declaration. Automatically generates icon and splash screen images, favicons and mstile images. Updates manifest.json and index.html files with the generated images according to Web App Manifest specs and Apple Human Interface guidelines.
- free-games-claimer: Automatically claims free games on the Epic Games Store, Amazon Prime Gaming and GOG.
- Rendora: Dynamic server-side rendering using headless Chrome to effortlessly solve the SEO problem for modern JavaScript websites.
Project mention: How SingleFile Transformed My Obsidian Workflow | news.ycombinator.com | 2024-01-26
That's interesting. I have been saving articles as PDF files, which is browser-independent but useful only for search and reference; quoting or copying and pasting from them is a nuisance.
If I search only the computer, I don't get results from eBay and Amazon at the top. Keeping the knowledge base separate from the primary notes is a good idea. In my case, that knowledge base is the file system, and the primary notes are whatever I choose.
When I was using Evernote, the inbox was the knowledge base and notebooks were the focus. I just had too many different potential projects going on to manage this well.
Looking to focus.
I'll revisit Firefox and SingleFile.
Explanation of the zip file inside.
https://github.com/gildas-lormeau/SingleFile/blob/master/faq...
In this guide, we'll be extracting information from Amazon product pages using the power of TypeScript in combination with the Cheerio and Crawlee libraries. We'll explore how to retrieve and extract detailed product data such as titles, prices, image URLs, and more from Amazon's vast marketplace. We'll also discuss handling potential blocking issues that may arise during the scraping process.
Project mention: How and why we ripped our Open Source product apart for a full rebuild | dev.to | 2024-02-28
The core product is managed, cloud-hosted browsers. We run thousands at a time using AWS and DigitalOcean, for people to use with Puppeteer and Playwright scripts. Our container is also available to self-deploy under an open-source license.
Use a server-side headless browser such as Puppeteer to convert the HTML to PDF. This is the most reliable free option, but it requires a server. If you need to use it in production, we recommend Gotenberg.
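As a rough illustration of the headless-browser approach, here is a minimal TypeScript sketch. The names `PdfPage`, `PdfBrowser`, and `htmlToPdf` are hypothetical and not taken from any project listed here; the two interfaces only describe the handful of Puppeteer methods the sketch relies on (`newPage`, `setContent`, `pdf`, `close`), so a browser returned by `puppeteer.launch()` should satisfy them. The page format and wait condition are assumptions, not recommendations.

```typescript
// Hypothetical structural types: just the Puppeteer methods this sketch uses.
interface PdfPage {
  setContent(
    html: string,
    options?: { waitUntil?: "load" | "networkidle0" }
  ): Promise<void>;
  pdf(options?: {
    format?: "A4" | "Letter";
    printBackground?: boolean;
  }): Promise<Uint8Array>;
}

interface PdfBrowser {
  newPage(): Promise<PdfPage>;
  close(): Promise<void>;
}

// Render an HTML string to PDF bytes and always close the browser afterwards.
async function htmlToPdf(
  browser: PdfBrowser,
  html: string
): Promise<Uint8Array> {
  try {
    const page = await browser.newPage();
    // Wait for network activity to settle so images and fonts are loaded.
    await page.setContent(html, { waitUntil: "networkidle0" });
    // A4 with backgrounds is an assumed default for invoice-style documents.
    return await page.pdf({ format: "A4", printBackground: true });
  } finally {
    await browser.close();
  }
}

// Assumed usage with the real library (requires `npm install puppeteer`):
//   import puppeteer from "puppeteer";
//   const bytes = await htmlToPdf(await puppeteer.launch(), "<h1>Invoice</h1>");
```

Passing the browser in as a parameter keeps the function easy to exercise with a stub, and it leaves the choice of a local Chrome or a remote one to the caller.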
You could use https://github.com/berstend/puppeteer-extra/tree/master/packages/puppeteer-extra-plugin-stealth, a plugin for evading anti-bot detection.
Project mention: Scraping Google trends, and incomplete datasets. Help, please? | /r/datasets | 2023-12-07
What I haven't tried:
- scraping and using these (single-page) tokens
- a headless browser
- web test frameworks like Selenium (a programmable browser)
- using FlareSolverr (my best bet): https://github.com/FlareSolverr/FlareSolverr
FlareSolverr is a headless browser / proxy developed to bypass Cloudflare. You can easily deploy it on-prem with Docker. I know Google has its own defence mechanisms, but I've had very good experience using it for scraping and crawling (at least Cloudflare-protected) websites. So I guess it's very good at pretending to be a normal browser and a normal user.
Project mention: The Case Against AI Everything, Everywhere, All at Once | news.ycombinator.com | 2023-10-19
You can still choose automation. The easier route for me is to use wallabag to save the article. Then on my reMarkable tablet I can grab a very readable document with https://github.com/koreader/koreader.
The other option is to use https://github.com/danburzo/percollate to convert a webpage to a nice document directly. I use both tools depending on my needs.
Project mention: A site that tracks the price of a Big Mac in every US McDonald's | news.ycombinator.com | 2024-01-13
Yes, there is a lot written about it. Here is one link I have saved:
I encourage you to experiment with the Unlighthouse CLI to see how it can meet your specific needs. Here is the link to official docs.
Project mention: Pyppeteer Tutorial: The Ultimate Guide to Using Puppeteer with Python | dev.to | 2024-02-05
The latest version of Pyppeteer, i.e., 1.0.2, can also be installed by executing pip3 install -U git+https://github.com/pyppeteer/pyppeteer@dev on the terminal.
PuppeteerSharp
Project mention: How To Generate Icons for a Progressive Web App from SVG File With a Single Command | dev.to | 2023-07-30
To generate icons, we use pwa-asset-generator. The first command generates a favicon icon with a transparent background, the second one creates all the necessary icons for a progressive web app, and the third one creates images for splash screens. The last command is optional, in case you have an icon for dark mode.
GitHub - vogler/free-games-claimer: Automatically claims free games on the Epic Games Store, Amazon Prime Gaming and GOG.
Puppeteer related posts
- Ask HN: What was an interesting project you started and finished over a weekend?
- How and why we ripped our Open Source product apart for a full rebuild
- Level Up Your JavaScript Applications with Load Testing
- A single tab web browser, no client-side JavaScript, over MJPEG from pptr
- A site that tracks the price of a Big Mac in every US McDonald's
- Scraping Google trends, and incomplete datasets. Help, please?
- Is this github safe to use?
Index
What are some of the best open-source Puppeteer projects? This list will help you:
# | Project | Stars |
---|---|---|
1 | SingleFile | 13,604 |
2 | crawlee | 12,044 |
3 | browserless | 7,842 |
4 | url-to-pdf-api | 6,969 |
5 | gotenberg | 6,693 |
6 | puppeteer-extra | 6,031 |
7 | venom | 5,699 |
8 | FlareSolverr | 5,608 |
9 | percollate | 4,103 |
10 | browser-fingerprinting | 3,830 |
11 | unlighthouse | 3,526 |
12 | jest-puppeteer | 3,519 |
13 | pyppeteer | 3,393 |
14 | qawolf | 3,273 |
15 | PuppeteerSharp | 3,149 |
16 | chrome-aws-lambda | 3,135 |
17 | puppeteer-cluster | 3,077 |
18 | page-skeleton-webpack-plugin | 2,780 |
19 | pwa-asset-generator | 2,626 |
20 | penthouse | 2,618 |
21 | awesome-puppeteer | 2,315 |
22 | free-games-claimer | 2,038 |
23 | Rendora | 1,992 |