Top 23 Go Crawler Projects
- crawlab: Distributed web crawler admin platform for managing spiders, regardless of language or framework.
- Rendora: Dynamic server-side rendering using headless Chrome to effortlessly solve the SEO problem for modern JavaScript websites.
- cariddi: Takes a list of domains, crawls URLs, and scans for endpoints, secrets, API keys, file extensions, tokens, and more.
- till: DataHen Till is a companion tool for your existing web scraper that instantly makes it more scalable, maintainable, and harder to block, with minimal code changes. Integrates with any scraper in 5 minutes.
- nebula: 🌌 A network-agnostic DHT crawler, monitor, and measurement tool that exposes timely information about DHT networks. (by dennis-tra)
- dorkscout: A Golang tool to automate Google dork scans against the entire internet or specific targets.
- spidy: Domain name collector that crawls websites and collects domain names along with their availability status. (by twiny)
- node-crawler: Attempts to crawl the Ethereum network for valid execution nodes and visualizes them in a web dashboard. (by ethereum)
Not a fix, but I tend to use lux when downloading from bilibili. It is faster too.
SerpApi focuses on scraping search results, so we need extra help to scrape individual sites. For that we'll use the GoColly package.
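To show the crawl-and-extract pattern that Colly wraps, here is a minimal sketch using only the Go standard library; the test server, its HTML, and the regexp-based link extraction are illustrative stand-ins (a real crawler would use a proper HTML parser):

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"regexp"
)

// hrefRe naively pulls href values out of raw HTML. Frameworks like
// Colly use a real HTML parser and CSS selectors instead.
var hrefRe = regexp.MustCompile(`href="([^"]+)"`)

// extractLinks returns every href found in the given HTML string.
func extractLinks(html string) []string {
	var links []string
	for _, m := range hrefRe.FindAllStringSubmatch(html, -1) {
		links = append(links, m[1])
	}
	return links
}

func main() {
	// Local stand-in for a real site so the example is self-contained.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, `<a href="/products">Products</a> <a href="/about">About</a>`)
	}))
	defer srv.Close()

	resp, err := http.Get(srv.URL)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		panic(err)
	}
	for _, link := range extractLinks(string(body)) {
		fmt.Println("found:", link)
	}
}
```

Colly's `OnHTML` callbacks replace the manual fetch-parse-extract loop above, and add queuing, deduplication, and politeness controls on top.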
Project mention: Show HN: I scraped 25M Shopify products to build a search engine | news.ycombinator.com | 2023-12-13

As someone who has scraped millions of items myself, I had success using Geziyor (https://github.com/geziyor/geziyor), built in Go. Shopify sites are especially easy to scrape because they tend to share the same product data formatting and don't hide it behind JS rendering.
Project mention: Show HN: Nebula – A network agnostic DHT crawler | news.ycombinator.com | 2024-03-20
I've developed yet another solution that can help you extract data from web archives :) You can use it as a standalone tool or import it into your Go project. GitHub: https://github.com/karust/gogetcrawl
Go Crawler related posts
- Show HN: Nebula – A network agnostic DHT crawler
- Modern automated data miner (scrapper)
- Scraping the full snippet from Google search result
- Show HN: I scraped 25M Shopify products to build a search engine
- Show HN: Flyscrape – A standalone and scriptable web scraper in Go
- New webcrawler for bug-hunters and data-miners
- Colly: Elegant Scraper and Crawler Framework for Golang
Index
What are some of the best open-source crawler projects in Go? This list will help you:
# | Project | Stars |
---|---|---|
1 | lux | 25,147 |
2 | colly | 22,120 |
3 | crawlab | 10,788 |
4 | katana | 8,661 |
5 | Pholcus | 7,504 |
6 | Ferret | 5,616 |
7 | crawlergo | 2,746 |
8 | Geziyor | 2,472 |
9 | Rendora | 1,993 |
10 | cariddi | 1,351 |
11 | go-dork | 975 |
12 | till | 807 |
13 | webpalm | 323 |
14 | nebula | 279 |
15 | antch | 255 |
16 | crawley | 227 |
17 | dorkscout | 216 |
18 | ChainWalker | 193 |
19 | slrp | 147 |
20 | spidy | 139 |
21 | gogetcrawl | 124 |
22 | seonaut | 121 |
23 | node-crawler | 107 |