playwright-python vs ArchiveBox

playwright-python

Python version of the Playwright testing and automation library. (by microsoft)

Source Code

playwright.dev

ArchiveBox

🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more... (by ArchiveBox)

Source Code

archivebox.io

Docs

Project	playwright-python	ArchiveBox
Mentions	35	269
Stars	13,613	24,882
Growth	1.1%	1.7%
Activity	9.2	9.8
Latest Commit	5 days ago	4 months ago
Language	Python	Python
License	Apache License 2.0	MIT

The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

playwright-python

Posts with mentions or reviews of playwright-python. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2025-04-30.

How to scrape TikTok using Python
5 projects | dev.to | 30 Apr 2025

TikTok uses quite a lot of JavaScript on its site, both for displaying content and for analyzing user behavior, including detecting and blocking crawlers. Therefore, for crawling TikTok, we'll use a headless browser with Playwright.
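As a rough illustration of the headless-browser approach the excerpt describes, the sketch below loads a JavaScript-heavy page with Playwright's synchronous API and returns the rendered HTML. The function name `fetch_rendered_html` and its timeout default are assumptions made for this example, not code from the linked post, and it assumes the third-party `playwright` package and a Chromium build are installed.

```python
def fetch_rendered_html(url: str, timeout_ms: int = 30_000) -> str:
    """Return the HTML of `url` after the page's JavaScript has run.

    Hypothetical helper for illustration only.
    """
    # Imported lazily so the module can be loaded without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            # Wait until network activity settles so script-rendered content is present.
            page.goto(url, wait_until="networkidle", timeout=timeout_ms)
            return page.content()
        finally:
            browser.close()
```

Sites that actively detect crawlers may still block a plain headless session, so this is only a starting point.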
Reviewing Prelude's Django starter template (by Sheena O'Connell)
7 projects | dev.to | 25 Apr 2025
Google and Anthropic are working on AI agents - so I made an open source alternative
3 projects | dev.to | 9 Jan 2025

Integrating Ollama, Microsoft vision models and Playwright I've made a simple agent that can browse websites and data to answer your query.
Ask HN: How to remove Ads from a downloaded HTML file to output an ad free file?
3 projects | news.ycombinator.com | 8 Nov 2024

Do you have to use Curl? It wouldn't render a lot of sites correctly anyway (anything that uses JS for rendering).
Can you run a puppeteer/playwright instance and add an ad blocker to that? e.g. https://github.com/ghostery/adblocker or https://github.com/microsoft/playwright-python/issues/782
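One way to approximate the suggestion above, sketched under assumptions: instead of a full ad-blocking library, the example uses Playwright's request routing to abort requests to hosts on a small blocklist before saving the rendered page. The blocklist entries and the names `is_blocked` and `save_without_ads` are hypothetical, and the third-party `playwright` package is assumed to be installed.

```python
from urllib.parse import urlparse

# Hypothetical blocklist for illustration; a real setup would load a maintained filter list.
BLOCKED_HOSTS = {"ads.example.com", "tracker.example.net"}


def is_blocked(url: str, blocked_hosts=BLOCKED_HOSTS) -> bool:
    """Return True if the URL's host is a blocked host or a subdomain of one."""
    host = urlparse(url).hostname or ""
    return any(host == b or host.endswith("." + b) for b in blocked_hosts)


def save_without_ads(url: str, out_path: str) -> None:
    """Render `url` in Chromium with blocked requests aborted, then save the HTML."""
    # Imported lazily so is_blocked() can be used without Playwright installed.
    from playwright.sync_api import sync_playwright

    def handle(route):
        # Abort requests to blocked hosts; let everything else through.
        if is_blocked(route.request.url):
            route.abort()
        else:
            route.continue_()

    with sync_playwright() as p:
        browser = p.chromium.launch()
        try:
            page = browser.new_page()
            page.route("**/*", handle)
            page.goto(url)
            with open(out_path, "w", encoding="utf-8") as f:
                f.write(page.content())
        finally:
            browser.close()
```

Blocking requests keeps ad scripts from loading and injecting content, but ad markup already present in the server-sent HTML would still need to be stripped separately.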
Scrape Google Flights with Python
1 project | dev.to | 21 Apr 2023

Playwright
Login for web-scraping help
1 project | /r/learnpython | 3 Apr 2023

An alternative is to use a package like playwright (or Selenium) to run a browser remotely and login.
Show HN: Use cookies from Chrome (CDP) in cURL without copy pasting
5 projects | news.ycombinator.com | 1 Apr 2023

Using the tools at hand is often the best approach. That said, I've spent most of the last 13 years of my career automating browsers. For years, I used Selenium with a variety of libraries. After switching to Puppeteer/Playwright, I have zero interest in going back lol. Playwright actually has first party Python support. (Puppeteer has a port called Pyppeteer, but it's no longer maintained and the author recommends using Playwright)
https://playwright.dev/python/
Any extension to automate workflow in automatic1111?
1 project | /r/StableDiffusion | 29 Mar 2023
Can Requests be used to make a call to a js script? Need some guidance.
3 projects | /r/learnpython | 25 Mar 2023
I can't find any good Python Selenium tutorials out there. Anyone got any good links to video tutorials or even documentation?
1 project | /r/softwaretesting | 1 Mar 2023

This is pretty great for web automation https://playwright.dev/python/

ArchiveBox

Posts with mentions or reviews of ArchiveBox. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2025-05-01.

Linkwarden: FOSS self-hostable bookmarking with AI-tagging and page archival
21 projects | news.ycombinator.com | 1 May 2025

I've used https://historio.us since 2011 and still pay for it to keep access to all the pages I've archived over the years. The price has been kept low enough that I can't bring myself to cancel it even though I've been using self-hosted https://archivebox.io/ for the last few years.
I always include an archived link whenever I reference something in documentation. That's my main use at the moment.
However, I also feel like I've gotten a lot of really good value when trying to learn a new development topic. Whenever I find something that looks like it might be useful, I archive it and, because everything is searchable, I end up with a searchable index of really high quality content once I actually know what I'm doing.
I find it hard to rediscover content via web search these days and there's so much churn that having a personal archive of useful content is going to increase in value, at least in my opinion.
Links copied from project READMEs now add "?tab=readme-ov-file" query parameter
1 project | news.ycombinator.com | 22 Mar 2025

The links the reporter is trying to use already don't work on mobile. If you want to link to the README file, link to the README file, e.g. https://github.com/ArchiveBox/ArchiveBox/blob/dev/README.md
I'll concede that this latter link is much longer than it perhaps should be, but I don't think the links the reporter used previously should ever have been used, as they don't work for a lot of people.
Small Archives
1 project | news.ycombinator.com | 19 Mar 2025
Ask HN: How Do You Bookmark?
2 projects | news.ycombinator.com | 9 Jan 2025

2. Drop the link into my instance of ArchiveBox [0] and will return to it a few weeks/months later or, more often than not, never again
[0] https://archivebox.io/
Is stuff online worth saving?
5 projects | news.ycombinator.com | 21 Dec 2024

I use https://github.com/gildas-lormeau/SingleFile
I set it to tolerate longer processing times, and to open the file after saving so I can sanity check that it got everything. Works great at faithfully saving a page with images as it appears in browser, and saves so much time.
You might also have a look at https://github.com/ArchiveBox/ArchiveBox
Ask HN: How to remove Ads from a downloaded HTML file to output an ad free file?
3 projects | news.ycombinator.com | 8 Nov 2024

After taking a break and stepping away for a bit, I realized that I was recreating an archiving system for websites and that there are existing solutions that do the same thing.
I found https://github.com/ArchiveBox/ArchiveBox/ which is a self hosted web archiving system. It covers most of my usecases (and I can extend it for additional functionality) so I am going to set this up and try it out.
Thanks all for the help.
Internet Archive breached again through stolen access tokens
2 projects | news.ycombinator.com | 20 Oct 2024

Is anyone using ArchiveBox regularly? It's a self-hosted archiving solution. Not the ambitious decentralized system I think this comment is thinking of but a practical way for someone to run an archive for themselves. https://archivebox.io/
ArchiveBox is evolving: the future of self-hosted internet archives
14 projects | news.ycombinator.com | 16 Oct 2024
Tell HN: The Wayback Machine is up, in read-only mode
1 project | news.ycombinator.com | 16 Oct 2024
Web Archiving Projects
1 project | news.ycombinator.com | 27 Sep 2024

What are some alternatives?

When comparing playwright-python and ArchiveBox you can also consider the following projects:

Playwright - Playwright is a framework for Web Testing and Automation. It allows testing Chromium, Firefox and WebKit with a single API.

linkwarden - ⚡️⚡️⚡️ Self-hosted collaborative bookmark manager to collect, read, annotate, and fully preserve what matters, all in one place.

Scrapy - Scrapy, a fast high-level web crawling & scraping framework for Python.

Wallabag - wallabag is a self hostable application for saving web pages: Save and classify articles. Read them later. Freely.

pyppeteer - Headless chrome/chromium automation library (unofficial port of puppeteer)

SingleFile - Web Extension for saving a faithful copy of a complete web page in a single HTML file
