Web Scraping in Python: Avoid Detection Like a Ninja

This page summarizes the projects mentioned and recommended in the original post on dev.to.

  • puppeteer

    Node.js API for Chrome

  • Selenium, Puppeteer, and Playwright are the best-known and most widely used libraries. They make scraping slower, so avoiding them is preferable for performance reasons. But sometimes, there is no alternative.

  • colly

    Elegant Scraper and Crawler Framework for Golang

  • We could write some snippets mixing all these, but the best option in real life is to use a tool that bundles them all, like Scrapy, pyspider, node-crawler (Node.js), or Colly (Go).
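
As a rough illustration of the browser-free approach the excerpts above prefer, here is a minimal sketch using only the Python standard library. It is not taken from the original post: the names `fetch`, `extract_links`, and `LinkCollector`, and the `User-Agent` value, are assumptions made up for this example. A framework such as Scrapy would handle these steps, plus scheduling and retries, for you.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collects absolute URLs from <a href="..."> tags (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Resolve relative links against the page URL.
            self.links.append(urljoin(self.base_url, href))


def extract_links(html, base_url):
    """Parse an HTML string and return the links it contains."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


def fetch(url, timeout=10):
    """Download a page with a plain HTTP request, no browser involved."""
    # The User-Agent string is a placeholder chosen for this sketch.
    request = Request(url, headers={"User-Agent": "example-scraper/0.1"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Calling `extract_links(fetch(url), url)` would return the links found on a page. This only works when the content is present in the raw HTML; pages that render their content with JavaScript are the cases where a browser-driving tool may still be needed.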

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.

Related posts

  • Scraping the full snippet from Google search result

    3 projects | dev.to | 1 Jan 2024
  • Show HN: Flyscrape – A standalone and scriptable web scraper in Go

    6 projects | news.ycombinator.com | 11 Nov 2023
  • Colly: Elegant Scraper and Crawler Framework for Golang

    1 project | news.ycombinator.com | 23 Aug 2023
  • Announcing GoWatch 1.0.0!

    3 projects | /r/selfhosted | 22 Apr 2023
  • colly VS scrapemate - a user suggested alternative

    2 projects | 15 Apr 2023