Web Scraping With Python (An Ultimate Guide)

This page summarizes the projects mentioned and recommended in the original post on reddit.com/r/Python.

  • parsel

    Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors.

    Something I don't see discussed when this topic is brought up is that Scrapy's HTML parsing library, parsel, can be installed separately from Scrapy itself. You can use it in place of BeautifulSoup and, imo, it's much easier to use.

  • Playwright

    Playwright is a framework for Web Testing and Automation. It allows testing Chromium, Firefox and WebKit with a single API.

  • parsel-cli

    CLI for evaluating CSS and XPath selectors.

    I like it so much that I even wrote a REPL for it, parsel-cli :) (it's a bit of a Frankenstein, though, as I'm working on a 2.0 release)

