Top 12 Go Spider Projects
Elegant Scraper and Crawler Framework for Golang
Project mention: Show HN: Flyscrape – A standalone and scriptable web scraper in Go | news.ycombinator.com | 2023-11-11
Interesting. Can you compare it to colly? 
Last time I looked it was the most popular choice for scraping in Go and I have some projects using it.
Is it similar? Does it have more/less features or is it more suited for a different use case? (Which one?)
Distributed web crawler admin platform for managing spiders regardless of language or framework.
Project mention: Self-hosted web scraper? | /r/selfhosted | 2023-01-03
Haven't tried but this project https://github.com/crawlab-team/crawlab looks promising.
Pholcus is a distributed high-concurrency crawler software written in pure golang
BitTorrent DHT Protocol && DHT Spider.
Geziyor, blazing fast web crawling & scraping framework for Go. Supports JS rendering.
Project mention: Show HN: Flyscrape – A standalone and scriptable web scraper in Go | news.ycombinator.com | 2023-11-11
It's been 8+ years since I started scraping. I even wrote a popular Go web scraping framework previously: (https://github.com/geziyor/geziyor).
These days, I'm not even using Go for scraping, as webpage changes drive me crazy, so I moved to TypeScript+Playwright. (The Crawlee framework is cool, though not strictly necessary.)
My favorite stack as of 2023: TypeScript+Playwright+Crawlee(Optional)
Take a list of domains, crawl URLs and scan for endpoints, secrets, API keys, file extensions, tokens and more
Project mention: cariddi v1.3.1 is out🥳 | /r/opensource | 2023-03-24
cariddi is an open source (https://github.com/edoardottt/cariddi) web security tool. It takes a list of domains as input, crawls URLs, and scans for endpoints, secrets, API keys, file extensions, tokens and more.
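The "scan for secrets and API keys" step described above can be illustrated with a minimal sketch. It assumes the scanner works by matching a set of named regular expressions against each fetched page body; the two patterns, the `Finding` type, and the `scan` function are hypothetical illustrations and are not taken from cariddi's source.

```go
package main

import (
	"fmt"
	"regexp"
)

// Finding records which kind of pattern matched and the matched text.
// This type is hypothetical, introduced only for this sketch.
type Finding struct {
	Kind  string
	Match string
}

// patterns is an assumed, deliberately tiny rule set. Real tools ship
// far larger and more carefully tuned lists.
var patterns = []struct {
	kind string
	re   *regexp.Regexp
}{
	{"aws-access-key-id", regexp.MustCompile(`AKIA[0-9A-Z]{16}`)},
	{"generic-api-key", regexp.MustCompile(`(?i)api[_-]?key\s*[:=]\s*["']?[A-Za-z0-9_-]{16,}`)},
}

// scan returns every match of every pattern in body, grouped by pattern order.
func scan(body string) []Finding {
	var findings []Finding
	for _, p := range patterns {
		for _, m := range p.re.FindAllString(body, -1) {
			findings = append(findings, Finding{Kind: p.kind, Match: m})
		}
	}
	return findings
}

func main() {
	// Example body using the placeholder key ID from AWS documentation.
	body := `var cfg = { apiKey: "abcd1234efgh5678ijkl", id: "AKIAIOSFODNN7EXAMPLE" };`
	for _, f := range scan(body) {
		fmt.Printf("%s: %s\n", f.Kind, f.Match)
	}
}
```

Regex-based scanning like this is prone to false positives, so a real scanner would likely pair it with allowlists or entropy checks.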
WebPalm is a powerful command-line tool for website mapping and web scraping. With its recursive approach, it can generate a complete tree of all webpages and their links on a website. It can also extract data from the body of each page using regular expressions, making it an ideal tool for web scraping and data extraction.
Project mention: New webcrawler for bug-hunters and data-miners | news.ycombinator.com | 2023-10-18
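The recursive mapping and regex extraction described for WebPalm can be sketched roughly as follows. This is not WebPalm's implementation: the `Node` type, `buildTree`, the stubbed `fetchFunc`, and both patterns are assumptions made for illustration, and pages are served from an in-memory map so the sketch runs offline.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// hrefPattern is a deliberately naive link matcher; a real crawler
// would use an HTML parser and resolve relative URLs.
var hrefPattern = regexp.MustCompile(`href="([^"]+)"`)

// emailPattern is an example of a user-supplied data pattern.
var emailPattern = regexp.MustCompile(`[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}`)

// Node is one page in the site tree, with the data extracted from it.
type Node struct {
	URL      string
	Matches  []string
	Children []*Node
}

// fetchFunc returns a page body; stubbed here instead of doing HTTP.
type fetchFunc func(url string) (string, bool)

// buildTree visits url, extracts data with the given pattern, and recurses
// into unseen links until depth reaches zero.
func buildTree(url string, fetch fetchFunc, data *regexp.Regexp, depth int, seen map[string]bool) *Node {
	seen[url] = true
	node := &Node{URL: url}
	body, ok := fetch(url)
	if !ok {
		return node
	}
	node.Matches = data.FindAllString(body, -1)
	if depth == 0 {
		return node
	}
	for _, m := range hrefPattern.FindAllStringSubmatch(body, -1) {
		if link := m[1]; !seen[link] {
			node.Children = append(node.Children, buildTree(link, fetch, data, depth-1, seen))
		}
	}
	return node
}

// printTree prints the tree with two spaces of indentation per level.
func printTree(n *Node, indent int) {
	fmt.Printf("%s%s %v\n", strings.Repeat("  ", indent), n.URL, n.Matches)
	for _, c := range n.Children {
		printTree(c, indent+1)
	}
}

func main() {
	pages := map[string]string{
		"/":        `<a href="/about">About</a> <a href="/contact">Contact</a>`,
		"/about":   `<a href="/">Home</a> team@example.com`,
		"/contact": `write to hello@example.com`,
	}
	fetch := func(u string) (string, bool) { b, ok := pages[u]; return b, ok }
	printTree(buildTree("/", fetch, emailPattern, 2, map[string]bool{}), 0)
}
```

The `seen` map keeps the recursion from looping on pages that link back to each other, and the depth limit bounds how far the tree grows.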
A web crawler for Go (by yields)
⚡ Lightweight Golang spider framework
Domain names collector - Crawl websites and collect domain names along with their availability status. (by twiny)
Golang Crawling and scraping framework (by gosom)
Project mention: colly VS scrapemate - a user suggested alternative | libhunt.com/r/colly | 2023-04-15
Recursive hostnames crawler
Project mention: Event Horizon | /r/candeltreow | 2023-03-05
Event Horizon is a project that describes interesting and safe places on the darknet, with the aim of dispelling society's stereotypes about the immoral horrors of the dark web. It lists many interesting sites with attached screenshots, short descriptions, and sometimes files. The project also has its own Telegram bot for taking screenshots of onion resources and an onion crawler for finding them, whose source code can be found on GitHub. Horizon was previously removed and resumed last year; publication activity has since fallen, and the project is presumably on pause. Links: Telegram, Telegram-bot, Tor-Crawler.
Go Spider related posts
New modern web crawling tool
2 projects | news.ycombinator.com | 30 Apr 2023
DHT VS dht - a user suggested alternative
2 projects | 13 Jan 2022
Show HN: A fast, feature-rich crawler for Go
1 project | news.ycombinator.com | 26 Mar 2021
- Fast, feature-rich web crawler for Go
- Feature rich crawler for Go.
- Create a tiny crawler/scraper for Go
What are some of the best open-source Spider projects in Go? This list will help you: