headless-chrome

Open-source projects categorized as headless-chrome

Top 23 headless-chrome Open-Source Projects

  • puppeteer

    Node.js API for Chrome

    Project mention: Learn Automated Testing At Home: A Beginner's Guide | dev.to | 2024-04-04

    Puppeteer is a Node library that provides a high-level API to control headless Chrome or Chromium using the DevTools Protocol. Key features: more control over Chrome, web scraping, screenshots and PDF generation for UI testing, and load-time measurement through the Chrome Performance Analysis tool.

  • Nightmare

    A high-level browser automation library.

  • crawlee

    Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.

    Project mention: How to scrape Amazon products | dev.to | 2024-04-01

    In this guide, we'll be extracting information from Amazon product pages using the power of TypeScript in combination with the Cheerio and Crawlee libraries. We'll explore how to retrieve and extract detailed product data such as titles, prices, image URLs, and more from Amazon's vast marketplace. We'll also discuss handling potential blocking issues that may arise during the scraping process.

  • url-to-pdf-api

    Web page PDF/PNG rendering done right. Self-hosted service for rendering receipts, invoices, or any content.

  • puppeteer-extra

    💯 Teach puppeteer new tricks through plugins.

    Project mention: What are your favorite Data Scraping tools? | /r/dataengineering | 2023-06-22

    You could use https://github.com/berstend/puppeteer-extra/tree/master/packages/puppeteer-extra-plugin-stealth, a plugin to evade anti-bot detection.

  • taiko

    A node.js library for testing modern web applications

  • puppeteer-cluster

    Puppeteer Pool, run a cluster of instances in parallel

  • serverless-chrome

    🌐 Run headless Chrome/Chromium on AWS Lambda

  • crawlergo

    A powerful browser crawler for web vulnerability scanners

    Project mention: Ethical Hacking Tool | /r/hackthebox | 2023-06-27

  • gowitness

    🔍 gowitness - a golang, web screenshot utility using Chrome Headless

    Project mention: NMAP-formatter: convert NMAP results to HTML, CSV, JSON, graphviz (dot), SQLite | news.ycombinator.com | 2024-01-26

    Very nice, another fun pentesting tool written in Go is gowitness

    https://github.com/sensepost/gowitness/wiki

  • pdf-bot

    🤖 A Node queue API for generating PDFs using headless Chrome. Comes with a CLI, S3 storage and webhooks for notifying subscribers about generated PDFs

  • awesome-puppeteer

    A curated list of awesome puppeteer resources.

  • playwright-go

    Playwright for Go, a browser automation library to control Chromium, Firefox, and WebKit with a single API.

  • ferrum

    Headless Chrome Ruby API

  • cuprite

    Headless Chrome/Chromium driver for Capybara

  • md-to-pdf

    Hackable CLI tool for converting Markdown files to PDF using Node.js and headless Chrome.

  • kimuraframework

    Kimurai is a modern web scraping framework written in Ruby that works out of the box with headless Chromium/Firefox, PhantomJS, or simple HTTP requests, and lets you scrape and interact with JavaScript-rendered websites.

  • wrp

    Web Rendering Proxy: Use vintage, historical, legacy browsers on the modern web

    Project mention: Web Rendering Proxy – Use historical browsers with the modern web | news.ycombinator.com | 2023-09-17

  • BotD

    Bot detection library that runs in the browser. Detects automation tools and frameworks. No server required, runs 100% on the client. MIT license, no usage restrictions.

    Project mention: Download numbers on crates.io too high? | /r/rust | 2023-05-31

    If the crates.io team wanted to go further they could employ some invasive methods to detect bots (usually it involves a JS library that does fingerprinting on the browser - something like BotD), but I'm not advocating for it. I don't think crates.io should collect more data, they should just perform better statistics on the data they already have.

  • proxy-chain

    Node.js implementation of a proxy server (think Squid) with support for SSL, authentication and upstream proxy chaining.

  • WitnessMe

    Web Inventory tool, takes screenshots of webpages using Pyppeteer (headless Chrome/Chromium) and provides some extra bells & whistles to make life easier.

  • docker-python-chromedriver

    Dockerfile for running Python Selenium in headless Chrome (Python 2.7 / 3.6 / 3.7 / 3.8 / Alpine based Python / Chromedriver / Selenium / Xvfb included in different versions)

  • spider

    The fastest web crawler written in Rust. Maintained by @a11ywatch. (by spider-rs)

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2024-04-04.

Index

What are some of the best open-source headless-chrome projects? This list will help you:

Rank Project Stars
1 puppeteer 86,628
2 Nightmare 19,510
3 crawlee 11,948
4 url-to-pdf-api 6,969
5 puppeteer-extra 6,031
6 taiko 3,474
7 puppeteer-cluster 3,077
8 serverless-chrome 2,829
9 crawlergo 2,744
10 gowitness 2,669
11 pdf-bot 2,608
12 awesome-puppeteer 2,315
13 playwright-go 1,758
14 ferrum 1,642
15 cuprite 1,194
16 md-to-pdf 1,064
17 kimuraframework 999
18 wrp 981
19 BotD 898
20 proxy-chain 778
21 WitnessMe 710
22 docker-python-chromedriver 634
23 spider 604