Python Web Crawling

Open-source Python projects categorized as Web Crawling

Top 22 Python Web Crawling Projects

  • Scrapy

    Scrapy, a fast high-level web crawling & scraping framework for Python.

    Project mention: Legalität von Web scraping | reddit.com/r/de_EDV | 2022-01-22
  • pyspider

    A Powerful Spider (Web Crawler) System in Python.

  • requests-html

    Pythonic HTML Parsing for Humans™

    Project mention: How to make all https traffic in program go through a specific proxy? | reddit.com/r/learnpython | 2021-12-24
  • portia

    Visual scraping for Scrapy

  • MechanicalSoup

    A Python library for automating interaction with websites.

  • RoboBrowser

  • Grab

    Web Scraping Framework

  • gain

    Web crawling framework based on asyncio.

  • PSpider

    A simple, easy-to-use Python crawler framework. QQ discussion group: 597510560

  • cola

    A high-level distributed crawling framework.

  • feedparser

    Parse feeds in Python

  • Sukhoi

    Minimalist and powerful Web Crawler.

  • MSpider

    Spider

  • spidy Web Crawler

    The simple, easy to use command line web crawler.

  • google-search-results-python

    Google Search Results via SERP API pip Python Package

  • Crawley

    Pythonic Crawling / Scraping Framework based on Non Blocking I/O operations.

  • brownant

    Brownant is a web data extracting framework.

  • Demiurge

    PyQuery-based scraping micro-framework.

  • Pomp

    Screen scraping and web crawling framework

  • FastImage

    Python library that finds the size / type of an image given its URI by fetching as little as needed (by bmuller)

  • microwler

    A micro-framework for asynchronous deep crawls and web scraping with Python

  • Mariner

    This is a mirror of the GitLab repository. Open your issues and pull requests there. (by radek-sprta)

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2022-01-22.

Index

What are some of the best open-source Web Crawling projects in Python? This list will help you:

Rank Project Stars
1 Scrapy 42,622
2 pyspider 15,288
3 requests-html 12,324
4 portia 8,309
5 MechanicalSoup 3,884
6 RoboBrowser 3,574
7 Grab 2,154
8 gain 2,005
9 PSpider 1,619
10 cola 1,423
11 feedparser 1,352
12 Sukhoi 881
13 MSpider 344
14 spidy Web Crawler 277
15 google-search-results-python 180
16 Crawley 172
17 brownant 157
18 Demiurge 107
19 Pomp 61
20 FastImage 28
21 microwler 9
22 Mariner 2