I've done some scraping in the past by writing a lot of Python boilerplate around requests for the HTTP calls and lxml for the parsing, but I think today you can go pretty far with a specialized framework like Scrapy: https://scrapy.org/
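For a sense of what that hand-rolled approach looks like, here is a minimal sketch, assuming the goal is just to collect the links on a page. The helper names `extract_links` and `fetch_links` and the 10-second timeout are made up for illustration; only `requests.get`, `lxml.html.fromstring`, and `urljoin` are real library calls.

```python
from urllib.parse import urljoin

import requests
from lxml import html


def extract_links(page_source: str, base_url: str) -> list[str]:
    """Return absolute URLs for every <a href> in the page, in document order."""
    tree = html.fromstring(page_source)
    # The XPath yields the raw href strings; urljoin resolves relative ones.
    return [urljoin(base_url, href) for href in tree.xpath("//a/@href")]


def fetch_links(url: str, timeout: float = 10.0) -> list[str]:
    """Download one page and extract its links (hypothetical helper)."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # fail loudly on 4xx/5xx instead of parsing an error page
    return extract_links(response.text, url)


if __name__ == "__main__":
    for link in fetch_links("https://scrapy.org/"):
        print(link)
```

Everything beyond this (retries, throttling, following links, deduplicating URLs, exporting results) is the boilerplate you end up writing yourself, and it is roughly the part a framework like Scrapy takes over.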
NOTE:
The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives.
Hence, a higher number means a more popular project.
Related posts
- Scrapy: A Fast and Powerful Scraping and Web Crawling Framework
- Implementing case sensitive headers in Scrapy (not through `_caseMappings`)
- Tips for projects using web scraping
- Best tools to use for web scraping ??
- I'm using python to scrape web page content and extract keywords, how can I make it faster to process?