Top 23 Go Crawler Projects

lux

13 25,147 7.0 Go

👾 Fast and simple video download library and CLI tool written in Go

Project mention: Bilibili download stalls at around 30-60% | /r/youtubedl | 2023-05-18

Not a fix, but I tend to use lux when downloading from bilibili. It is faster too.

colly

39 22,120 6.0 Go

Elegant Scraper and Crawler Framework for Golang

Project mention: Scraping the full snippet from Google search result | dev.to | 2024-01-01

SerpApi focuses on scraping search results. That's why we need extra help to scrape individual sites. We'll use GoColly package.

crawlab

4 10,788 6.8 Go

Distributed web crawler admin platform for managing spiders, regardless of language or framework.
katana

9 8,661 9.1 Go

A next-generation crawling and spidering framework.
Pholcus

0 7,504 0.0 Go

Pholcus is distributed, high-concurrency crawler software written in pure Go
Ferret

0 5,616 3.1 Go

Declarative web scraping
crawlergo

1 2,746 2.6 Go

A powerful browser crawler for web vulnerability scanners

Project mention: Ethical Hacking Tool | /r/hackthebox | 2023-06-27

Geziyor

2 2,472 0.6 Go

Geziyor, blazing fast web crawling & scraping framework for Go. Supports JS rendering.

Project mention: Show HN: I scraped 25M Shopify products to build a search engine | news.ycombinator.com | 2023-12-13

As someone who has scraped millions of items myself, I had success using Geziyor (https://github.com/geziyor/geziyor) built in Go. Shopify sites are especially easy to scrape because they tend to share the same product data formatting and don't hide it behind JS rendering.

Rendora

2 1,993 0.0 Go

Dynamic server-side rendering using headless Chrome to effortlessly solve the SEO problem for modern JavaScript websites
cariddi

7 1,351 6.7 Go

Take a list of domains, crawl URLs and scan for endpoints, secrets, API keys, file extensions, tokens and more
go-dork

4 975 3.3 Go

The fastest dork scanner written in Go.
till

5 807 1.8 Go

DataHen Till is a companion tool to your existing web scraper that instantly makes it scalable, maintainable, and more unblockable, with minimal code changes on your scraper. Integrates with any scraper in 5 minutes.
webpalm

11 323 8.2 Go

🕸️ Crawl in the web network

Project mention: Modern automated data miner (scrapper) | news.ycombinator.com | 2024-02-08

nebula

10 279 8.9 Go

🌌 A network agnostic DHT crawler, monitor, and measurement tool that exposes timely information about DHT networks. (by dennis-tra)

Project mention: Show HN: Nebula – A network agnostic DHT crawler | news.ycombinator.com | 2024-03-20

antch

0 255 0.0 Go

Antch, a fast, powerful and extensible web crawling & scraping framework for Go
crawley

8 227 6.7 Go

The unix-way web crawler (by s0rg)
dorkscout

14 216 0.0 Go

DorkScout - Golang tool to automate Google dork scans against the entire internet or specific targets
ChainWalker

1 193 5.9 Go

Rapid Smart Contract Crawler
slrp

2 147 8.3 Go

rotating open proxy multiplexer
spidy

4 139 0.0 Go

Domain names collector - Crawl websites and collect domain names along with their availability status. (by twiny)
gogetcrawl

1 124 5.2 Go

Extract web archive data using Wayback Machine and Common Crawl

Project mention: A tool/package for Web Archive data extraction | /r/golang | 2023-05-31

I've developed yet another solution that can help you extract data from web archives :) You can use it as a separate tool, or import it into your Go project. Github: https://github.com/karust/gogetcrawl

seonaut

3 121 8.9 Go

Open source SEO auditing tool.
node-crawler

1 107 7.7 Go

Attempts to crawl the Ethereum network of valid Ethereum execution nodes and visualizes them in a nice web dashboard. (by ethereum)

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Go Crawler related posts

Show HN: Nebula – A network agnostic DHT crawler
1 project | news.ycombinator.com | 20 Mar 2024
Modern automated data miner (scrapper)
1 project | news.ycombinator.com | 8 Feb 2024
Scraping the full snippet from Google search result
3 projects | dev.to | 1 Jan 2024
Show HN: I scraped 25M Shopify products to build a search engine
4 projects | news.ycombinator.com | 13 Dec 2023
Show HN: Flyscrape – A standalone and scriptable web scraper in Go
6 projects | news.ycombinator.com | 11 Nov 2023
New webcrawler for bug-hunters and data-miners
1 project | news.ycombinator.com | 18 Oct 2023
Colly: Elegant Scraper and Crawler Framework for Golang
1 project | news.ycombinator.com | 23 Aug 2023

Index

What are some of the best open-source Crawler projects in Go? This list will help you:

#	Project	Stars
1	lux	25,147
2	colly	22,120
3	crawlab	10,788
4	katana	8,661
5	Pholcus	7,504
6	Ferret	5,616
7	crawlergo	2,746
8	Geziyor	2,472
9	Rendora	1,993
10	cariddi	1,351
11	go-dork	975
12	till	807
13	webpalm	323
14	nebula	279
15	antch	255
16	crawley	227
17	dorkscout	216
18	ChainWalker	193
19	slrp	147
20	spidy	139
21	gogetcrawl	124
22	seonaut	121
23	node-crawler	107