Top 8 headless-browser Open-Source Projects
-
ArchiveBox
🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...
-
html2image
A package acting as a wrapper around the headless mode of existing web browsers to generate images from URLs and from HTML+CSS strings or files.
-
Python-Selenium-Action
Run Selenium with Python via Github Actions using Headless or Non-Headless browsers!
-
phantomime
An embeddable headless browser package for Python that provides a simplified interface for interacting with web pages using Selenium and Selenium Hub.
Project mention: Ask HN: What Underrated Open Source Project Deserves More Recognition? | news.ycombinator.com | 2024-03-07
Two projects I greatly appreciate, allowing me to easily archive my bandcamp and GOG purchases (after the initial setup anyways):
https://github.com/easlice/bandcamp-downloader
https://github.com/Kalanyr/gogrepoc
And I recently learned about archivebox, which I think is going to be a fast favorite and finally let me clear out my mess of tabs/bookmarks: https://github.com/ArchiveBox/ArchiveBox
You could use https://github.com/berstend/puppeteer-extra/tree/master/packages/puppeteer-extra-plugin-stealth, a plugin to evade anti-bot detection.
scrapy-playwright is an integration between Scrapy and Playwright. It enables scraping dynamic web pages with Scrapy by processing the web scraping requests using a Playwright instance.
Project mention: Ask HN: What's your "it's not stupid if it works" story? | news.ycombinator.com | 2023-12-22
It uses the headless version of Chrome/Chromium or Edge behind the scenes.
It made me realize that even big projects have features that just don't work. Edge headless wouldn't let you take screenshots up until recently, and I still encountered issues with Firefox last time I tried to add support for it in the package. I also stumbled upon weird behaviors of Chrome CDP when trying to implement an alternative to using the headless mode, and these issues eventually fixed themselves after some Chrome updates.
[1] https://github.com/vgalin/html2image
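The mention above says the package drives the headless mode of Chrome/Chromium or Edge behind the scenes. As a rough illustration of that general idea (not html2image's actual code), the sketch below builds a one-shot headless screenshot command using Chromium's `--headless`, `--screenshot` and `--window-size` command-line switches. The function names and the list of candidate executable names are assumptions made for this example and vary by platform.

```python
import shutil
import subprocess
from pathlib import Path

# Hypothetical candidate executable names; adjust for your platform.
CANDIDATE_BROWSERS = ("google-chrome", "chromium", "chromium-browser", "microsoft-edge")


def build_screenshot_command(browser, url, output, width=1280, height=720):
    """Return the argument list for a one-shot headless screenshot."""
    return [
        browser,
        "--headless",
        f"--screenshot={Path(output).resolve()}",  # absolute output path
        f"--window-size={width},{height}",
        url,
    ]


def find_browser(candidates=CANDIDATE_BROWSERS):
    """Return the path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


def take_screenshot(url, output, width=1280, height=720):
    """Run the browser once in headless mode and write a PNG to `output`."""
    browser = find_browser()
    if browser is None:
        raise RuntimeError("no Chromium-based browser found on PATH")
    cmd = build_screenshot_command(browser, url, output, width, height)
    subprocess.run(cmd, check=True, timeout=60)
```

Calling `take_screenshot("https://example.com", "example.png")` would need a Chromium-based browser installed. As the post notes, behaviour differs between browsers and versions, so treat this as a starting point only.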
Project mention: marionette: A Selenium alternative written in Crystal | /r/crystal_programming | 2023-06-20
Project mention: Saturday Daily Thread: Resource Request and Sharing! Daily Thread | /r/Python | 2023-06-24
Shared this multiple times - but always seems to be helpful to someone. A GitHub action / template to run your Selenium based scripts on GitHub with ease. https://github.com/MarketingPipeline/Python-Selenium-Action
headless-browser related posts
-
Show HN: Generate a concatenated file of all CSS used on a given website
-
Saturday Daily Thread: Resource Request and Sharing! Daily Thread
-
marionette: A Selenium alternative written in Crystal
-
youtube bandwidth throttled for cloud addresses?
-
Python-Selenium-Action: Run Selenium with Python via Github Actions!
-
Python-Selenium-Action: Easily Run Selenium with Python via Github Actions using Headless or Non-Headless browsers!
-
Ask HN: What's the best way to get all the HTML from a JavaScript site?
Index
What are some of the best open-source headless-browser projects? This list will help you:
Rank | Project | Stars |
---|---|---|
1 | ArchiveBox | 19,915 |
2 | puppeteer-extra | 6,122 |
3 | scrapy-playwright | 847 |
4 | html2image | 322 |
5 | marionette | 177 |
6 | Python-Selenium-Action | 150 |
7 | autoprotonvpn | 10 |
8 | phantomime | 8 |