Top 10 Python web-crawler Projects
omniparse
Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks
Project mention: Show HN: I Made an Open Source Platform for Structuring Any Unstructured Data | news.ycombinator.com | 2024-07-02
crawlee-python
Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.
Which hashtags are trending now? What is an influencer's engagement rate? What topics are important for a content creator? You can find answers to these and many other questions by analyzing TikTok data. However, for analysis, you need to extract the data in a convenient format. In this blog, we'll explore how to scrape TikTok using Crawlee for Python.
Ignareo-ISML-auto-voter
Ignareo the Carillon, a web crawler/spider template of ultimate high concurrency built for leprechauns. Carillons as the best web spiders; Long live the golden years of leprechauns! (ISML = International Saimoe League; the 2022 ISML was the last ISML)
Python
This repository stores Python source code for more than 40 Python projects spanning many fields, including web crawlers, algorithms, OpenGL, tkinter, and object-oriented programming. (by qfcy)
CobWeb-lnx
CobWeb is a Python library for web scraping. The library consists of two classes: Spider and Scraper.
Index
What are some of the best open-source web-crawler projects in Python? This list will help you:
| # | Project | Stars |
|---|---|---|
| 1 | Scrapegraph-ai | 20,030 |
| 2 | omniparse | 6,589 |
| 3 | crawlee-python | 5,749 |
| 4 | PSpider | 1,839 |
| 5 | kochat | 455 |
| 6 | spidy Web Crawler | 347 |
| 7 | Ignareo-ISML-auto-voter | 187 |
| 8 | GoodreadsScraper | 138 |
| 9 | Python | 57 |
| 10 | CobWeb-lnx | 38 |