Top 23 Go Scraper Projects

lux

13 25,312 7.4 Go

👾 Fast and simple video download library and CLI tool written in Go

Project mention: Bilibili download stalls at around 30-60% | /r/youtubedl | 2023-05-18

Not a fix, but I tend to use lux when downloading from bilibili. It is faster too.

colly

39 22,205 5.7 Go

Elegant Scraper and Crawler Framework for Golang

Project mention: Scraping the full snippet from Google search result | dev.to | 2024-01-01

SerpApi focuses on scraping search results. That's why we need extra help to scrape individual sites. We'll use GoColly package.

Ferret

0 5,620 1.8 Go

Declarative web scraping
rod

20 4,808 7.9 Go

A Devtools driver for web automation and scraping

Project mention: Need help authenticating to Okta programatically. | /r/okta | 2023-07-03

I have tried the following. 1. Login to Okta via browser programmatically using go-rod, which I managed to do successfully, but I'm failing to load Slack as it's stuck on the browser loader screen for Slack. 2. I tried to authenticate via the Okta RESTful API. So far, I have managed to authenticate using {{domain}}/api/v1/authn, and then subsequently using MFA via the verify endpoint {{domain}}/api/v1/authn/factors/{{factorID}}/verify, which returns a sessionToken. From here, I can successfully create a sessionCookie, which has proven quite useless to me. Perhaps I am doing it wrongly.

Geziyor

2 2,480 0.6 Go

Geziyor, blazing fast web crawling & scraping framework for Go. Supports JS rendering.

Project mention: Show HN: I scraped 25M Shopify products to build a search engine | news.ycombinator.com | 2023-12-13

As someone who has scraped millions of items myself, I had success using Geziyor (https://github.com/geziyor/geziyor) built in Go. Shopify sites are especially easy to scrape because they tend to share the same product data formatting and don't hide it behind JS rendering.

cariddi

7 1,360 7.6 Go

Take a list of domains, crawl urls and scan for endpoints, secrets, api keys, file extensions, tokens and more
mangal

5 1,176 0.0 Go

📖 The most advanced (yet simple) cli manga downloader in the entire universe! Lua scrapers, export formats, anilist integration, fancy TUI and more!

Project mention: What application handles manga downloads? | /r/selfhosted | 2023-05-19

till

5 807 1.8 Go

DataHen Till is a companion tool to your existing web scraper that instantly makes it scalable, maintainable, and more unblockable, with minimal code changes on your scraper. Integrates with any scraper in 5 minutes.
finance-go

29 686 5.0 Go

📊 Financial markets data library implemented in Go.

Project mention: finance-go: NEW Data - star count:602.0 | /r/algoprojects | 2023-05-13

Dataflow kit

0 636 0.0 Go

Extract structured data from web sites. Web sites scraping.
ant

4 276 0.0 Go

A web crawler for Go (by yields)
GMDB

1 234 0.0 Go

GMDB is the ultra-simple, cross-platform Movie Library with Features (Search, Take Note, Watch Later, Like, Import, Learn, Instantly Torrent Magnet Watch)
dorkscout

14 216 0.0 Go

DorkScout - Golang tool to automate Google dork scans against the entire internet or specific targets
demeter

2 174 0.0 Go

Demeter is a tool for scraping the calibre web ui
meteor

1 171 6.7 Go

Meteor is an easy-to-use, plugin-driven metadata collection framework to extract data from different sources and sink to any data catalog. (by raystack)
JsonGenius

12 156 6.1 Go

Get structured JSON data from any page.

Project mention: Show HN: SingleAPI – Convert the Internet into your own API | news.ycombinator.com | 2023-10-17

isn’t this just using jsongenius[1]
[1] https://github.com/semanser/JsonGenius

spidy

4 142 0.0 Go

Domain names collector - Crawl websites and collect domain names along with their availability status. (by twiny)
scraply

1 126 0.0 Go

Scraply: a simple DOM scraper to fetch information from any HTML-based website
fitter

15 98 8.9 Go

A new way to collect information from APIs and websites (by PxyUp)

Project mention: Show HN: Fitter – configurable open-source scraper | news.ycombinator.com | 2024-01-14

rrip

6 70 5.8 Go

Bulk image downloader for reddit.

Project mention: rrip v0.5 - Go template filters / formatting, GNU style long options | /r/DataHoarder | 2023-08-03

ultimate-guitar-scraper

1 69 4.8 Go

A simple scraper for Ultimate-Guitar.com's mobile API, written in Go. (by Pilfer)

Project mention: Freetar – an alternative front end for ultimate-guitar.com | news.ycombinator.com | 2023-11-29

I added a PR[0] to a CLI project doing exactly this for all your saved tabs. Could probably be extended for all tabs on the site as well.
[0]: https://github.com/Pilfer/ultimate-guitar-scraper/pull/2

reX

1 63 6.6 Go

Reverse Engineered Twitter's API (by zmovane)

Project mention: GitHub - Amovane/reX: Reverse Engineered Twitter's API: Since twitter dev removed the API for accessing user followers and following, developers have found it difficult to obtain this data. Here, I'm sharing my reverse engineering solution | /r/bag_o_news | 2023-09-18

grab

7 63 1.4 Go

Configurable Scraper & Downloader, Powered by RegExp and Go (by everdrone)

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Go Scraper related posts

Show HN: Fitter – configurable open-source scraper

1 project | news.ycombinator.com | 14 Jan 2024
Scraping the full snippet from Google search result

3 projects | dev.to | 1 Jan 2024
Show HN: Flyscrape – A standalone and scriptable web scraper in Go

6 projects | news.ycombinator.com | 11 Nov 2023
Colly: Elegant Scraper and Crawler Framework for Golang

1 project | news.ycombinator.com | 23 Aug 2023
rrip v0.5 - Go template filters / formatting, GNU style long options

1 project | /r/DataHoarder | 3 Aug 2023
PxyUp/fitter: New way for collect information from the API's/Websites

1 project | /r/golang | 3 Jul 2023
Show HN: Fitter – next generation web-scraper

1 project | news.ycombinator.com | 28 Jun 2023

Index

What are some of the best open-source Scraper projects in Go? This list will help you:

	Project	Stars
1	lux	25,312
2	colly	22,205
3	Ferret	5,620
4	rod	4,808
5	Geziyor	2,480
6	cariddi	1,360
7	mangal	1,176
8	till	807
9	finance-go	686
10	Dataflow kit	636
11	ant	276
12	GMDB	234
13	dorkscout	216
14	demeter	174
15	meteor	171
16	JsonGenius	156
17	spidy	142
18	scraply	126
19	fitter	98
20	rrip	70
21	ultimate-guitar-scraper	69
22	reX	63
23	grab	63
