Python Scrapy

Open-source Python projects categorized as Scrapy

Top 23 Python Scrapy Projects

  • scrapy-redis

    Redis-based components for Scrapy.

  • Project mention: How to make scrapy run multiple times on the same URLs? | /r/scrapy | 2023-06-26
  • Gerapy

    Distributed Crawler Management Framework Based on Scrapy, Scrapyd, Django and Vue.js

  • scrapy-splash

    Scrapy+Splash for JavaScript integration

  • scrapydweb

    Web app for Scrapyd cluster management, Scrapy log analysis & visualization, Auto packaging, Timer tasks, Monitor & Alert, and Mobile UI.

  • SpiderKeeper

    admin ui for scrapy/open source scrapinghub

  • advertools

    advertools - online marketing productivity and analysis tools

  • scrapy-playwright

    🎭 Playwright integration for Scrapy

  • Project mention: Web Scraping Dynamic Websites With Scrapy Playwright | dev.to | 2024-03-06

    scrapy-playwright is an integration between Scrapy and Playwright. It enables scraping dynamic web pages with Scrapy by processing the web scraping requests using a Playwright instance.

  • scrapyrt

    HTTP API for Scrapy spiders

  • scrapy-rotating-proxies

    use multiple proxies with Scrapy

  • scrapy-fake-useragent

    Random User-Agent middleware based on fake-useragent

  • alltheplaces

    A set of spiders and scrapers to extract location information from places that post their location on the internet.

  • Project mention: Differentiating between hypermarkets and supermarkets. | /r/openstreetmap | 2023-12-09

    Maybe a different approach? https://www.alltheplaces.xyz/ has stores grouped by name

  • estela

    estela, an elastic web scraping cluster 🕸

  • GoodreadsScraper

    Scrape data from Goodreads using Scrapy and Selenium

  • scrapy-cloudflare-middleware

    A Scrapy middleware to bypass Cloudflare's anti-bot protection

  • scrapy-crawl-once

    Scrapy middleware that allows crawling only new content

  • open-gov-crawlers

    Parse government documents into well-formed JSON

  • scrapy-mysql-pipeline

    scrapy mysql pipeline

  • scrapeops-scrapy-sdk

    Scrapy extension that gives you all the scraping monitoring, alerting, scheduling, and data validation you will need straight out of the box.

  • Project mention: Distribution of gross and net salaries on r/BESalary [OC] | /r/BESalary | 2023-07-01

    My favourite scraping tool is Scrapy; it requires some Python knowledge, but there are some very good tutorials about it on https://scrapeops.io

  • scrapingant-client-python

    ScrapingAnt API client for Python.

  • burplist

    Web crawler for Burplist, a search engine for craft beers in Singapore

  • hltv-scraping

    Scraping data from hltv.org

  • nse-stock-scraper

    A web scraper that uses the Scrapy framework, MongoDB, and AfricasTalking to get stock prices for companies listed on the Nairobi Stock Exchange. The project stores each ticker name and price, and sends SMS notifications once AfricasTalking is set up properly.

  • NSFW_Scraper

    Scraper that collects metadata for all available scenes and movies and stores it in PostgreSQL every few days.

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).
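
The scrapy-playwright entry above says that requests are processed by a Playwright instance. As a rough sketch of what that looks like in practice, the fragment below shows the two settings the scrapy-playwright README asks for and the per-request `playwright` meta flag. The names `PLAYWRIGHT_SETTINGS` and `playwright_request_kwargs` are hypothetical helpers invented for this illustration; they are not part of either library.

```python
# Settings that would normally live in a Scrapy project's settings.py.
# The handler path and reactor path follow the scrapy-playwright README.
PLAYWRIGHT_HANDLER = "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler"

# Hypothetical container for this sketch; in a real project these keys
# are top-level variables in settings.py.
PLAYWRIGHT_SETTINGS = {
    "DOWNLOAD_HANDLERS": {
        "http": PLAYWRIGHT_HANDLER,
        "https": PLAYWRIGHT_HANDLER,
    },
    "TWISTED_REACTOR": "twisted.internet.asyncioreactor.AsyncioSelectorReactor",
}


def playwright_request_kwargs(url):
    """Hypothetical helper: keyword arguments for a Scrapy request that
    should be downloaded through Playwright instead of the default handler."""
    # Setting meta["playwright"] to True is what routes the request
    # through the Playwright download handler.
    return {"url": url, "meta": {"playwright": True}}
```

Inside a spider, one would typically pass these keyword arguments to `scrapy.Request`, for example `yield scrapy.Request(**playwright_request_kwargs(url))`. Requests without the `playwright` meta key keep using Scrapy's regular downloader.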

Index

What are some of the best open-source Scrapy projects in Python? This list will help you:

Rank Project Stars
1 scrapy-redis 5,451
2 Gerapy 3,210
3 scrapy-splash 3,051
4 scrapydweb 3,001
5 SpiderKeeper 2,704
6 advertools 1,055
7 scrapy-playwright 828
8 scrapyrt 816
9 scrapy-rotating-proxies 705
10 scrapy-fake-useragent 681
11 alltheplaces 528
12 estela 153
13 GoodreadsScraper 115
14 scrapy-cloudflare-middleware 102
15 scrapy-crawl-once 77
16 open-gov-crawlers 61
17 scrapy-mysql-pipeline 48
18 scrapeops-scrapy-sdk 36
19 scrapingant-client-python 31
20 burplist 11
21 hltv-scraping 10
22 nse-stock-scraper 10
23 NSFW_Scraper 8
