Top 23 headless-chrome Open-Source Projects

puppeteer

Mentions: 356 | Stars: 86,628 | Activity: 9.9 | Language: TypeScript

Node.js API for Chrome

Project mention: Learn Automated Testing At Home: A Beginner's Guide | dev.to | 2024-04-04

1. Puppeteer: Puppeteer is a Node library that provides a high-level API to control headless Chrome or Chromium using the DevTools Protocol. Key features: more control over Chrome; enables web scraping; allows taking screenshots and generating PDFs for UI testing; measures load times through the Chrome Performance Analysis tool.
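The screenshot and PDF features mentioned above can be sketched roughly as follows. This is a minimal illustration, not the project's own example: `capturePage`, `PageLike`, and `BrowserLike` are hypothetical names introduced here, and the two interfaces only mirror the handful of Puppeteer methods the sketch calls (`newPage`, `goto`, `screenshot`, `pdf`, `close`), so the helper can also be exercised with a stub instead of a real browser.

```typescript
// Structural types for the few Puppeteer-style methods this sketch calls.
// They are assumptions made for illustration, not Puppeteer's own typings.
interface PageLike {
  goto(url: string, options?: { waitUntil?: string }): Promise<unknown>;
  screenshot(options: { path: string; fullPage?: boolean }): Promise<unknown>;
  pdf(options: { path: string; format?: string }): Promise<unknown>;
}

interface BrowserLike {
  newPage(): Promise<PageLike>;
  close(): Promise<void>;
}

// Hypothetical helper: opens a page, saves a full-page PNG and an A4 PDF,
// and returns the two file names it wrote. The browser is always closed,
// even if navigation or rendering fails.
async function capturePage(
  browser: BrowserLike,
  url: string,
  baseName: string
): Promise<string[]> {
  const pngPath = `${baseName}.png`;
  const pdfPath = `${baseName}.pdf`;
  try {
    const page = await browser.newPage();
    // "networkidle2" waits until the page has at most two open connections.
    await page.goto(url, { waitUntil: "networkidle2" });
    await page.screenshot({ path: pngPath, fullPage: true });
    await page.pdf({ path: pdfPath, format: "A4" });
    return [pngPath, pdfPath];
  } finally {
    await browser.close();
  }
}

// Possible usage with the real library (assumes `npm install puppeteer`):
//   import puppeteer from "puppeteer";
//   const browser = await puppeteer.launch();
//   await capturePage(browser, "https://example.com", "example");
```

Passing the browser in as a parameter keeps the helper independent of how Chrome is launched, which may be convenient when the same code has to run locally and in a container.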
Nightmare

Mentions: 8 | Stars: 19,510 | Activity: 2.0 | Language: JavaScript

A high-level browser automation library.
crawlee

Mentions: 29 | Stars: 11,948 | Activity: 9.8 | Language: TypeScript

Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.

Project mention: How to scrape Amazon products | dev.to | 2024-04-01

In this guide, we'll be extracting information from Amazon product pages using the power of TypeScript in combination with the Cheerio and Crawlee libraries. We'll explore how to retrieve and extract detailed product data such as titles, prices, image URLs, and more from Amazon's vast marketplace. We'll also discuss handling potential blocking issues that may arise during the scraping process.
url-to-pdf-api

Mentions: 3 | Stars: 6,969 | Activity: 1.4 | Language: HTML

Web page PDF/PNG rendering done right. Self-hosted service for rendering receipts, invoices, or any content.
puppeteer-extra

Mentions: 28 | Stars: 6,031 | Activity: 0.0 | Language: JavaScript

💯 Teach puppeteer new tricks through plugins.

Project mention: What are your favorite Data Scraping tools? | /r/dataengineering | 2023-06-22

You could use https://github.com/berstend/puppeteer-extra/tree/master/packages/puppeteer-extra-plugin-stealth, a plugin to evade anti-bot detection.
taiko

Mentions: 4 | Stars: 3,474 | Activity: 1.3 | Language: JavaScript

A node.js library for testing modern web applications
puppeteer-cluster

Mentions: 5 | Stars: 3,077 | Activity: 6.4 | Language: TypeScript

Puppeteer Pool, run a cluster of instances in parallel
serverless-chrome

Mentions: 2 | Stars: 2,829 | Activity: 0.0 | Language: JavaScript

🌐 Run headless Chrome/Chromium on AWS Lambda
crawlergo

Mentions: 1 | Stars: 2,744 | Activity: 2.6 | Language: Go

A powerful browser crawler for web vulnerability scanners

Project mention: Ethical Hacking Tool | /r/hackthebox | 2023-06-27
gowitness

Mentions: 5 | Stars: 2,669 | Activity: 6.6 | Language: Go

🔍 gowitness - a golang, web screenshot utility using Chrome Headless

Project mention: NMAP-formatter: convert NMAP results to HTML, CSV, JSON, graphviz (dot), SQLite | news.ycombinator.com | 2024-01-26

Very nice. Another fun pentesting tool written in Go is gowitness: https://github.com/sensepost/gowitness/wiki
pdf-bot

Mentions: 1 | Stars: 2,608 | Activity: 0.0 | Language: JavaScript

🤖 A Node queue API for generating PDFs using headless Chrome. Comes with a CLI, S3 storage and webhooks for notifying subscribers about generated PDFs
awesome-puppeteer

Mentions: 1 | Stars: 2,315 | Activity: 3.0

A curated list of awesome puppeteer resources.
playwright-go

Mentions: 8 | Stars: 1,758 | Activity: 7.5 | Language: Go

Playwright for Go: a browser automation library that controls Chromium, Firefox, and WebKit with a single API.
ferrum

Mentions: 9 | Stars: 1,642 | Activity: 8.5 | Language: Ruby

Headless Chrome Ruby API
cuprite

Mentions: 5 | Stars: 1,194 | Activity: 6.8 | Language: Ruby

Headless Chrome/Chromium driver for Capybara
md-to-pdf

Mentions: 1 | Stars: 1,064 | Activity: 7.0 | Language: TypeScript

Hackable CLI tool for converting Markdown files to PDF using Node.js and headless Chrome.
kimuraframework

Mentions: 5 | Stars: 999 | Activity: 0.0 | Language: Ruby

Kimurai is a modern web scraping framework written in Ruby. It works out of the box with headless Chromium/Firefox, PhantomJS, or simple HTTP requests, and lets you scrape and interact with JavaScript-rendered websites.
wrp

Mentions: 52 | Stars: 981 | Activity: 6.2 | Language: Go

Web Rendering Proxy: use vintage, historical, and legacy browsers on the modern web.

Project mention: Web Rendering Proxy – Use historical browsers with the modern web | news.ycombinator.com | 2023-09-17
BotD

Mentions: 8 | Stars: 898 | Activity: 7.8 | Language: TypeScript

Bot detection library that runs in the browser. Detects automation tools and frameworks. No server required, runs 100% on the client. MIT license, no usage restrictions.

Project mention: Download numbers on crates.io too high? | /r/rust | 2023-05-31

If the crates.io team wanted to go further they could employ some invasive methods to detect bots (usually it involves a JS library that does fingerprinting on the browser - something like BotD), but I'm not advocating for it. I don't think crates.io should collect more data, they should just perform better statistics on the data they already have.
proxy-chain

Mentions: 2 | Stars: 778 | Activity: 4.1 | Language: JavaScript

Node.js implementation of a proxy server (think Squid) with support for SSL, authentication and upstream proxy chaining.
WitnessMe

Mentions: 1 | Stars: 710 | Activity: 3.4 | Language: Python

Web Inventory tool, takes screenshots of webpages using Pyppeteer (headless Chrome/Chromium) and provides some extra bells & whistles to make life easier.
docker-python-chromedriver

Mentions: 1 | Stars: 634 | Activity: 3.1 | Language: Dockerfile

Dockerfile for running Python Selenium in headless Chrome (Python 2.7 / 3.6 / 3.7 / 3.8 / Alpine based Python / Chromedriver / Selenium / Xvfb included in different versions)
spider

Mentions: 1 | Stars: 604 | Activity: 9.5 | Language: Rust

The fastest web crawler written in Rust. Maintained by @a11ywatch. (by spider-rs)

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2024-04-04.

headless-chrome related posts

How and why we ripped our Open Source product apart for a full rebuild
1 project | dev.to | 28 Feb 2024
Launching Crawlee Blog: Your Node.js resource hub for web scraping and automation.
1 project | dev.to | 26 Feb 2024
Show HN: Quetta – A privacy-first web browser with enhanced ad blocker inside
2 projects | news.ycombinator.com | 18 Jan 2024
Anything like scrapy in other languages?
1 project | /r/webscraping | 10 Dec 2023
How To Enable Hardware Acceleration on Chrome, Chromium & Puppeteer on AWS in Headless mode
4 projects | dev.to | 25 Oct 2023
The 5 Node.js PDF Libraries Every Developer Must Know
2 projects | dev.to | 11 Oct 2023
A question about web-scraping
1 project | /r/learnprogramming | 1 Sep 2023

Index

What are some of the best open-source headless-chrome projects? This list will help you:

#	Project	Stars
1	puppeteer	86,628
2	Nightmare	19,510
3	crawlee	11,948
4	url-to-pdf-api	6,969
5	puppeteer-extra	6,031
6	taiko	3,474
7	puppeteer-cluster	3,077
8	serverless-chrome	2,829
9	crawlergo	2,744
10	gowitness	2,669
11	pdf-bot	2,608
12	awesome-puppeteer	2,315
13	playwright-go	1,758
14	ferrum	1,642
15	cuprite	1,194
16	md-to-pdf	1,064
17	kimuraframework	999
18	wrp	981
19	BotD	898
20	proxy-chain	778
21	WitnessMe	710
22	docker-python-chromedriver	634
23	spider	604