Web Scraping with Python: Everything you need to know

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • scrapy-playwright

    🎭 Playwright integration for Scrapy

  • You can use something like scrapy-playwright[0] to run a headless browser framework as your download handler. I think there are versions for some of the other headless systems, if you prefer those.

    [0] https://github.com/scrapy-plugins/scrapy-playwright
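As a minimal sketch of what "headless browser as your download handler" can look like with scrapy-playwright: the settings below route HTTP and HTTPS downloads through the plugin's handler and switch Twisted to the asyncio reactor, which the plugin needs. The names `custom_settings` and `playwright_meta` are illustrative helpers, not part of the library, and the block is kept free of imports so it runs without Scrapy installed.

```python
# Dotted path of the download handler shipped by scrapy-playwright.
PLAYWRIGHT_HANDLER = "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler"

# Settings sketch: send http/https downloads through Playwright and use the
# asyncio-based Twisted reactor. `custom_settings` is an illustrative name;
# the same keys could live in a project's settings.py instead.
custom_settings = {
    "DOWNLOAD_HANDLERS": {
        "http": PLAYWRIGHT_HANDLER,
        "https": PLAYWRIGHT_HANDLER,
    },
    "TWISTED_REACTOR": "twisted.internet.asyncioreactor.AsyncioSelectorReactor",
}


def playwright_meta(**extra):
    """Build request meta that opts a single request into Playwright.

    Hypothetical helper: only requests whose meta contains
    "playwright": True are rendered in the headless browser.
    """
    return {"playwright": True, **extra}


print(custom_settings["DOWNLOAD_HANDLERS"])
print(playwright_meta())
```

In a spider, this would typically mean setting `custom_settings` as a class attribute and yielding `scrapy.Request(url, meta=playwright_meta())` for pages that need JavaScript rendering; requests without that meta key go through the handler without opening a browser page.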

  • YouTube.js

    A wrapper around YouTube's internal API — reverse engineering InnerTube

  • Love this approach. We can just bypass all the normal web scraping and get the structured data straight from the source. These APIs are usually no less stable than the ever-changing HTML structure anyway.

    Case study: YouTube.js

    https://news.ycombinator.com/item?id=31021611

    https://github.com/LuanRT/YouTube.js
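To illustrate why structured data "straight from the source" is easier to work with than scraped HTML, here is a rough sketch that reads fields out of a JSON response body. The payload and its field names are invented for the example and are not the schema of YouTube's InnerTube API or of YouTube.js; `extract_titles` is a hypothetical helper.

```python
import json

# Hypothetical JSON body, shaped the way an internal API might return it.
# The field names are made up for illustration only.
SAMPLE_RESPONSE = """
{"items": [
  {"id": "a1", "title": "First video", "views": 1200},
  {"id": "b2", "title": "Second video", "views": 87}
]}
"""


def extract_titles(body: str) -> list[str]:
    """Parse a JSON response body and return the title of each item."""
    data = json.loads(body)
    # No selectors or HTML parsing needed: the fields are addressed by key.
    return [item["title"] for item in data.get("items", [])]


print(extract_titles(SAMPLE_RESPONSE))
```

The trade-off the comment points at still applies: an undocumented endpoint can change its field names without notice, so lookups like `item["title"]` are the part most likely to need maintenance.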

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.

Related posts

  • Web Scraping Dynamic Websites With Scrapy Playwright

    1 project | dev.to | 6 Mar 2024
  • Scrapy & splash guide

    1 project | /r/learnpython | 18 Feb 2023
  • which libraries/frameworks could be used for page interaction?

    2 projects | /r/webscraping | 6 Nov 2022
  • Implementing a Selenium backend on a web app?

    1 project | /r/webscraping | 8 Oct 2022
  • Scraping Dynamic Javascript Websites with Scrapy and Scrapy-playwright

    2 projects | dev.to | 14 Jun 2022