Top 23 Go Scraper Projects
-
cariddi
Take a list of domains, crawl urls and scan for endpoints, secrets, api keys, file extensions, tokens and more
-
mangal
The most advanced (yet simple) cli manga downloader in the entire universe! Lua scrapers, export formats, anilist integration, fancy TUI and more!
-
till
DataHen Till is a companion tool to your existing web scraper that instantly makes it scalable, maintainable, and more unblockable, with minimal code changes on your scraper. Integrates with any scraper in 5 minutes.
-
GMDB
GMDB is the ultra-simple, cross-platform Movie Library with Features (Search, Take Note, Watch Later, Like, Import, Learn, Instantly Torrent Magnet Watch)
-
dorkscout
DorkScout - Golang tool to automate Google dork scans against the entire internet or specific targets
-
meteor
Meteor is an easy-to-use, plugin-driven metadata collection framework to extract data from different sources and sink to any data catalog. (by raystack)
-
spidy
Domain names collector - Crawl websites and collect domain names along with their availability status. (by twiny)
-
ultimate-guitar-scraper
A simple scraper for Ultimate-Guitar.com's mobile API, written in Go. (by Pilfer)
Not a fix, but I tend to use lux when downloading from bilibili. It is faster too.
SerpApi focuses on scraping search results. That's why we need extra help to scrape individual sites. We'll use GoColly package.
I have tried the following:
1. Logging in to Okta through the browser programmatically using go-rod. This works, but Slack then fails to load and stays stuck on its browser loader screen.
2. Authenticating via the Okta RESTful API. So far, I have managed to authenticate using {{domain}}/api/v1/authn, and then to complete MFA via the verify endpoint {{domain}}/api/v1/authn/factors/{{factorID}}/verify, which returns a sessionToken. From there, I can successfully create a sessionCookie, which has proven quite useless to me. Perhaps I am doing it wrong.
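The two REST calls described in that post can be sketched in Go roughly as follows. This is a minimal illustration, not the poster's code: the helper names (`authnRequest`, `verifyRequest`, `jsonPost`), the `example.okta.com` domain, and the JSON field names (`username`, `password`, `stateToken`, `passCode`) are assumptions based on Okta's classic Authentication API, and the sketch only builds the requests without sending them.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
)

// jsonPost builds a POST request with a JSON body and JSON headers.
func jsonPost(rawURL string, body []byte) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPost, rawURL, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Accept", "application/json")
	return req, nil
}

// authnRequest builds the primary authentication call to {domain}/api/v1/authn.
// The field names are assumed, not taken from the post.
func authnRequest(domain, username, password string) (*http.Request, error) {
	body, err := json.Marshal(map[string]string{
		"username": username,
		"password": password,
	})
	if err != nil {
		return nil, err
	}
	return jsonPost(domain+"/api/v1/authn", body)
}

// verifyRequest builds the MFA call to
// {domain}/api/v1/authn/factors/{factorID}/verify.
// stateToken is assumed to come from the authn response.
func verifyRequest(domain, factorID, stateToken, passCode string) (*http.Request, error) {
	body, err := json.Marshal(map[string]string{
		"stateToken": stateToken,
		"passCode":   passCode,
	})
	if err != nil {
		return nil, err
	}
	endpoint := domain + "/api/v1/authn/factors/" + url.PathEscape(factorID) + "/verify"
	return jsonPost(endpoint, body)
}

func main() {
	// Hypothetical values, for illustration only.
	req, err := verifyRequest("https://example.okta.com", "factor123", "state-token", "123456")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
}
```

Sending either request with an `http.Client` and decoding the JSON response would be the next step. How the resulting sessionToken should be used to reach Slack is the open question in the post and is not addressed here.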
Project mention: Show HN: I scraped 25M Shopify products to build a search engine | news.ycombinator.com | 2023-12-13
As someone who has scraped millions of items myself, I had success using Geziyor (https://github.com/geziyor/geziyor) built in Go. Shopify sites are especially easy to scrape because they tend to share the same product data formatting and don't hide it behind JS rendering.
Project mention: Show HN: SingleAPI - Convert the Internet into your own API | news.ycombinator.com | 2023-10-17
Isn't this just using jsongenius[1]?
[1] https://github.com/semanser/JsonGenius
Project mention: Show HN: Fitter - configurable open-source scraper | news.ycombinator.com | 2024-01-14
Project mention: rrip v0.5 - Go template filters / formatting, GNU style long options | /r/DataHoarder | 2023-08-03
Project mention: Freetar - an alternative front end for ultimate-guitar.com | news.ycombinator.com | 2023-11-29
I added a PR[0] to a CLI project doing exactly this for all your saved tabs. Could probably be extended for all tabs on the site as well.
[0]: https://github.com/Pilfer/ultimate-guitar-scraper/pull/2
Project mention: GitHub - Amovane/reX: Reverse Engineered Twitter's API: Since twitter dev removed the API for accessing user followers and following, developers have found it difficult to obtain this data. Here, I'm sharing my reverse engineering solution | /r/bag_o_news | 2023-09-18
Go Scraper related posts
-
Show HN: Fitter - configurable open-source scraper
-
Scraping the full snippet from Google search result
-
Show HN: Flyscrape - A standalone and scriptable web scraper in Go
-
Colly: Elegant Scraper and Crawler Framework for Golang
-
rrip v0.5 - Go template filters / formatting, GNU style long options
-
PxyUp/fitter: New way for collect information from the API's/Websites
-
Show HN: Fitter - next generation web-scraper
Index
What are some of the best open-source Scraper projects in Go? This list will help you:
# | Project | Stars |
---|---|---|
1 | lux | 25,312 |
2 | colly | 22,205 |
3 | Ferret | 5,620 |
4 | rod | 4,808 |
5 | Geziyor | 2,480 |
6 | cariddi | 1,360 |
7 | mangal | 1,176 |
8 | till | 807 |
9 | finance-go | 686 |
10 | Dataflow kit | 636 |
11 | ant | 276 |
12 | GMDB | 234 |
13 | dorkscout | 216 |
14 | demeter | 174 |
15 | meteor | 171 |
16 | JsonGenius | 156 |
17 | spidy | 142 |
18 | scraply | 126 |
19 | fitter | 98 |
20 | rrip | 70 |
21 | ultimate-guitar-scraper | 69 |
22 | reX | 63 |
23 | grab | 63 |