Scrapy
Scrapy, a fast high-level web crawling & scraping framework for Python. (by scrapy)
playwright-python
Python version of the Playwright testing and automation library. (by microsoft)
Metric | Scrapy | playwright-python
---|---|---
Mentions | 189 | 35
Stars | 57,527 | 13,391
Growth | 3.7% | 1.6%
Activity | 9.7 | 9.2
Last commit | 7 days ago | 1 day ago
Language | Python | Python
License | BSD 3-clause "New" or "Revised" License | Apache License 2.0
Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
Scrapy
Posts with mentions or reviews of Scrapy.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2025-01-16.
- Scrapy needs to have sane defaults that do no harm
- Top 10 Tools for Efficient Web Scraping in 2025
  Scrapy is a robust and scalable open-source web crawling framework. It is highly efficient for large-scale projects and supports asynchronous scraping.
- 11 best open-source web crawlers and scrapers in 2024
  Language: Python | GitHub: 52.9k stars
- Current problems and mistakes of web scraping in Python and tricks to solve them!
  One might ask, what about Scrapy? I'll be honest: I don't really keep up with their updates. But I haven't heard about Zyte doing anything to bypass TLS fingerprinting. So out of the box Scrapy will also be blocked, but nothing is stopping you from using curl_cffi in your Scrapy Spider.
- Scrapy, a fast high-level web crawling and scraping framework for Python
- Automate Spider Creation in Scrapy with Jinja2 and JSON
  Install Scrapy (official website) using either pip or conda (follow for detailed instructions):
- Analyzing Svenskalag Data using DBT and DuckDB
  Using Scrapy I fetched the data needed (activities and attendance). Scrapy handled authentication using a form request in a very simple way:
- Scrapy Vs. Crawlee
  Scrapy is an open-source Python-based web scraping framework that extracts data from websites. With Scrapy, you create spiders, which are autonomous scripts to download and process web content. The limitation of Scrapy is that it does not work very well with JavaScript-rendered websites, as it was designed for static HTML pages. We will do a comparison later in the article about this.
- Claude is now available in Europe
- Scrapy: A Fast and Powerful Scraping and Web Crawling Framework
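The "Scrapy Vs. Crawlee" excerpt above says that a crawler built for static HTML struggles with JavaScript-rendered pages. The following is a minimal sketch of why, using only Python's standard library rather than Scrapy itself. The sample page, the `VisibleTextParser` class, and the `visible_text` helper are hypothetical names invented for this illustration. The page ships an empty container that a script would fill in a real browser; a parser that only reads the downloaded markup never sees that content.

```python
from html.parser import HTMLParser

# Hypothetical page: the server sends an empty container, and a script
# fills it in only when a browser actually executes the JavaScript.
STATIC_HTML = """
<html>
  <body>
    <h1>Products</h1>
    <div id="items"></div>
    <script>
      document.getElementById("items").innerHTML = "<p>Widget</p>";
    </script>
  </body>
</html>
"""


class VisibleTextParser(HTMLParser):
    """Collect text nodes, skipping the contents of <script> tags."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.texts = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        text = data.strip()
        # Script source is delivered as raw data; it is never executed.
        if text and not self.in_script:
            self.texts.append(text)


def visible_text(html):
    """Return the text a static parser can see in the given markup."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return parser.texts


print(visible_text(STATIC_HTML))  # the heading only; "Widget" never appears
```

A browser automation tool such as Playwright runs the script before the page is read, which is the usual reason the two kinds of tool are combined or compared.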
playwright-python
Posts with mentions or reviews of playwright-python.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2025-04-30.
- How to scrape TikTok using Python
  TikTok uses quite a lot of JavaScript on its site, both for displaying content and for analyzing user behavior, including detecting and blocking crawlers. Therefore, for crawling TikTok, we'll use a headless browser with Playwright.
- reviewing prelude's django starter template (by Sheena O'Connell)
- Google and Anthropic are working on AI agents - so I made an open source alternative
  Integrating Ollama, Microsoft vision models and Playwright I've made a simple agent that can browse websites and data to answer your query.
- Ask HN: How to remove Ads from a downloaded HTML file to output an ad free file?
  Do you have to use Curl? It wouldn't render a lot of sites correctly anyway (anything that uses JS for rendering).
  Can you run a puppeteer/playwright instance and add an ad blocker to that? e.g. https://github.com/ghostery/adblocker or https://github.com/microsoft/playwright-python/issues/782
- Scrape Google Flights with Python
  Playwright
- Login for web-scraping help
  An alternative is to use a package like playwright (or Selenium) to run a browser remotely and log in.
- Show HN: Use cookies from Chrome (CDP) in cURL without copy pasting
  Using the tools at hand is often the best approach. That said, I've spent most of the last 13 years of my career automating browsers. For years, I used Selenium with a variety of libraries. After switching to Puppeteer/Playwright, I have zero interest in going back lol. Playwright actually has first-party Python support. (Puppeteer has a port called Pyppeteer, but it's no longer maintained and the author recommends using Playwright.)
  https://playwright.dev/python/
- Any extension to automate workflow in automatic1111?
- Can Requests be used to make a call to a js script? Need some guidance.
- I can't find any good Python Selenium tutorials out there. Anyone got any good links to video tutorials or even documentation?
  This is pretty great for web automation https://playwright.dev/python/
What are some alternatives?
When comparing Scrapy and playwright-python you can also consider the following projects:
requests-html - Pythonic HTML Parsing for Humans™
Playwright - Playwright is a framework for Web Testing and Automation. It allows testing Chromium, Firefox and WebKit with a single API.
MechanicalSoup - A Python library for automating interaction with websites.
pyppeteer - Headless chrome/chromium automation library (unofficial port of puppeteer)
pyspider - A Powerful Spider(Web Crawler) System in Python.
playwright-dotnet - .NET version of the Playwright testing and automation library.