Go Crawler

Open-source Go projects categorized as Crawler

Top 23 Go Crawler Projects

  • lux

    👾 Fast and simple video download library and CLI tool written in Go

    Project mention: Bilibili download stalls at around 30-60% | /r/youtubedl | 2023-05-18

    Not a fix, but I tend to use lux when downloading from bilibili. It is faster too.

  • colly

    Elegant Scraper and Crawler Framework for Golang

    Project mention: New modern web crawling tool | news.ycombinator.com | 2023-04-30

    Sounds cool, but how is this different from Colly: https://github.com/gocolly/colly?

  • crawlab

    Distributed web crawler admin platform for managing spiders, regardless of language or framework.

    Project mention: Self-hosted web scraper? | /r/selfhosted | 2023-01-03

    Haven't tried but this project https://github.com/crawlab-team/crawlab looks promising.

  • Pholcus

    Pholcus is a distributed high-concurrency crawler software written in pure golang

  • katana

    A next-generation crawling and spidering framework.

    Project mention: Originally a Covid project. Now a discount search engine. | /r/nextjs | 2023-02-05

    Using a few different methods. To pull the sites I'm using Puppeteer and Katana (https://github.com/projectdiscovery/katana). Processing and extracting the information is tricky; most websites selling things put time into their metadata, which does make it easier. Additionally, a lot of the larger stores have common patterns between them. Failing all of this, I trained a TensorFlow model to understand how to read product pages. However, it's far from perfect and a journey of continual improvement.

  • Ferret

    Declarative web scraping

  • Rendora

    dynamic server-side rendering using headless Chrome to effortlessly solve the SEO problem for modern javascript websites

  • Geziyor

    Geziyor, blazing fast web crawling & scraping framework for Go. Supports JS rendering.

  • cariddi

    Take a list of domains, crawl urls and scan for endpoints, secrets, api keys, file extensions, tokens and more

    Project mention: cariddi v1.3.1 is out🥳 | /r/opensource | 2023-03-24

    cariddi is an open source (https://github.com/edoardottt/cariddi) web security tool. It takes a list of domains as input, crawls URLs, and scans for endpoints, secrets, API keys, file extensions, tokens and more.

  • go-dork

    The fastest dork scanner written in Go.

  • till

    DataHen Till is a companion tool to your existing web scraper that instantly makes it scalable, maintainable, and more unblockable, with minimal code changes on your scraper. Integrates with any scraper in 5 minutes.

  • antch

    Antch, a fast, powerful and extensible web crawling & scraping framework for Go

  • dorkscout

    DorkScout - Golang tool to automate Google dork scans against the entire internet or specific targets

    Project mention: Automatizovani Google Dorking | /r/programiranje | 2023-04-14

  • ChainWalker

    Rapid Smart Contract Crawler

    Project mention: Chain Walker - Smart Contract (RCP/IPC) Crawler 👻🧛‍♂️ | /r/netsec | 2022-06-18

  • crawley

    The unix-way web crawler (by s0rg)

    Project mention: github/crawley v1.5.0 released | /r/golang | 2022-10-08

    crawley project: https://github.com/s0rg/crawley

  • spidy

    Domain names collector - Crawl websites and collect domain names along with their availability status. (by twiny)

    Project mention: Share Your Code.. Share your most unique piece of Go code. | /r/golang | 2022-10-15

    1 - Expired domain scraper => https://github.com/twiny/spidy
    2 - A simple & efficient web crawler => https://github.com/twiny/wbot
    3 - A mini blockchain scanner => https://github.com/twiny/blockscan
    4 - A Snake Game => https://github.com/twiny/snaky

  • pagser

    Pagser is a simple, extensible, configurable library that parses and deserializes HTML pages into structs, based on goquery and struct tags, for Golang crawlers

  • slrp

    rotating open proxy multiplexer

    Project mention: SLRP – rotating open proxy multiplexer | news.ycombinator.com | 2022-07-12

  • bathyscaphe

    Fast, highly configurable, cloud native dark web crawler.

  • skweez

    Fast website scraper and wordlist generator

  • google-search-results-golang

    Google Search Results GoLang API

  • seonaut

    Open source SEO auditing tool.

  • webpalm

    WebPalm is a powerful command-line tool for website mapping and web scraping. With its recursive approach, it can generate a complete tree of all webpages and their links on a website. It can also extract data from the body of each page using regular expressions, making it an ideal tool for web scraping and data extraction.

    Project mention: webpalm | /r/redteamsec | 2023-06-05

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2023-06-05.

Index

What are some of the best open-source Crawler projects in Go? This list will help you:

Rank Project Stars
1 lux 21,187
2 colly 19,693
3 crawlab 9,857
4 Pholcus 7,392
5 katana 6,578
6 Ferret 5,393
7 Rendora 1,962
8 Geziyor 1,933
9 cariddi 919
10 go-dork 799
11 till 799
12 antch 248
13 dorkscout 199
14 ChainWalker 166
15 crawley 143
16 spidy 116
17 pagser 82
18 slrp 82
19 bathyscaphe 82
20 skweez 56
21 google-search-results-golang 46
22 seonaut 40
23 webpalm 38