Top 23 headless-chrome Open-Source Projects
-
Puppeteer: Puppeteer is a Node.js library that provides a high-level API to control headless Chrome or Chromium over the DevTools Protocol. Key features: more control over Chrome, web scraping, screenshots and PDF generation for UI testing, and load-time measurement with Chrome's performance analysis tools.
-
crawlee
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.
In this guide, we'll be extracting information from Amazon product pages using the power of TypeScript in combination with the Cheerio and Crawlee libraries. We'll explore how to retrieve and extract detailed product data such as titles, prices, image URLs, and more from Amazon's vast marketplace. We'll also discuss handling potential blocking issues that may arise during the scraping process.
-
url-to-pdf-api
Web page PDF/PNG rendering done right. Self-hosted service for rendering receipts, invoices, or any content.
-
You could use https://github.com/berstend/puppeteer-extra/tree/master/packages/puppeteer-extra-plugin-stealth, a plugin for evading anti-bot detection.
-
Project mention: NMAP-formatter: convert NMAP results to HTML, CSV, JSON, graphviz (dot), SQLite | news.ycombinator.com | 2024-01-26
Very nice. Another fun pentesting tool written in Go is gowitness.
-
pdf-bot
🤖 A Node queue API for generating PDFs using headless Chrome. Comes with a CLI, S3 storage and webhooks for notifying subscribers about generated PDFs
-
playwright-go
Playwright for Go: a browser automation library that controls Chromium, Firefox, and WebKit with a single API.
-
kimuraframework
Kimurai is a modern web scraping framework written in Ruby. It works out of the box with headless Chromium/Firefox, PhantomJS, or simple HTTP requests, and lets you scrape and interact with JavaScript-rendered websites.
-
Project mention: Web Rendering Proxy – Use historical browsers with the modern web | news.ycombinator.com | 2023-09-17
-
BotD
Bot detection library that runs in the browser. Detects automation tools and frameworks. No server required, runs 100% on the client. MIT license, no usage restrictions.
If the crates.io team wanted to go further they could employ some invasive methods to detect bots (usually it involves a JS library that does fingerprinting on the browser - something like BotD), but I'm not advocating for it. I don't think crates.io should collect more data, they should just perform better statistics on the data they already have.
-
proxy-chain
Node.js implementation of a proxy server (think Squid) with support for SSL, authentication and upstream proxy chaining.
-
WitnessMe
Web inventory tool that takes screenshots of webpages using Pyppeteer (headless Chrome/Chromium) and provides some extra bells & whistles to make life easier.
-
docker-python-chromedriver
Dockerfile for running Python Selenium in headless Chrome (Python 2.7 / 3.6 / 3.7 / 3.8 / Alpine based Python / Chromedriver / Selenium / Xvfb included in different versions)
headless-chrome related posts
- How and why we ripped our Open Source product apart for a full rebuild
- Launching Crawlee Blog: Your Node.js resource hub for web scraping and automation.
- Show HN: Quetta – A privacy-first web browser with enhanced ad blocker inside
- Anything like scrapy in other languages?
- How To Enable Hardware Acceleration on Chrome, Chromium & Puppeteer on AWS in Headless mode
- The 5 Node.js PDF Libraries Every Developer Must Know
- A question about web-scraping
Index
What are some of the best open-source headless-chrome projects? This list will help you:
Rank | Project | Stars |
---|---|---|
1 | puppeteer | 86,628 |
2 | Nightmare | 19,510 |
3 | crawlee | 11,948 |
4 | url-to-pdf-api | 6,969 |
5 | puppeteer-extra | 6,031 |
6 | taiko | 3,474 |
7 | puppeteer-cluster | 3,077 |
8 | serverless-chrome | 2,829 |
9 | crawlergo | 2,744 |
10 | gowitness | 2,669 |
11 | pdf-bot | 2,608 |
12 | awesome-puppeteer | 2,315 |
13 | playwright-go | 1,758 |
14 | ferrum | 1,642 |
15 | cuprite | 1,194 |
16 | md-to-pdf | 1,064 |
17 | kimuraframework | 999 |
18 | wrp | 981 |
19 | BotD | 898 |
20 | proxy-chain | 778 |
21 | WitnessMe | 710 |
22 | docker-python-chromedriver | 634 |
23 | spider | 604 |