go-cloudflare-scraper
CycleTLS
Metric | go-cloudflare-scraper | CycleTLS
---|---|---
Mentions | 2 | 3
Stars | 126 | 786
Growth | 0.8% | -
Activity | 0.0 | 6.7
Last commit | over 1 year ago | 17 days ago
Language | Go | Go
License | - | GNU General Public License v3.0 only
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
CycleTLS
- Is it possible to scrape a website protected by Cloudflare?
A lot of websites nowadays add fingerprint checking. So even if you fake the headers, it won't help, as Cloudflare still knows you are making the request from Go / Python / whatever. There is a library for spoofing the fingerprint: https://github.com/Danny-Dasilva/CycleTLS. It may work for you.
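To make the "faking the headers" part concrete, here is a minimal sketch using only the Go standard library. The User-Agent value, the helper names, and the local test server are assumptions for illustration, not taken from the post. It shows that the server sees whatever headers the client sets; it does not change the TLS ClientHello that Go's `crypto/tls` sends, which is the part a fingerprint check looks at.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// browserUA is an illustrative browser-like User-Agent (an assumption, not from the post).
const browserUA = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"

// newBrowserLikeRequest builds a GET request whose headers imitate a browser.
func newBrowserLikeRequest(url string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("User-Agent", browserUA)
	req.Header.Set("Accept-Language", "en-US,en;q=0.9")
	return req, nil
}

// observedUserAgent sends the request to a local test server and
// returns the User-Agent header that the server received.
func observedUserAgent() (string, error) {
	seen := make(chan string, 1)
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		seen <- r.UserAgent()
	}))
	defer srv.Close()

	req, err := newBrowserLikeRequest(srv.URL)
	if err != nil {
		return "", err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", err
	}
	resp.Body.Close()
	return <-seen, nil
}

func main() {
	ua, err := observedUserAgent()
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	// The header is spoofed, but over HTTPS the TLS handshake would still look like Go.
	fmt.Println("server saw User-Agent:", ua)
}
```

The headers are fully under the client's control, which is why a server that only inspects them is easy to fool, and why fingerprint checks look at the handshake instead.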
- Curl’s TLS Fingerprint
- Stack under attack: what we learned about handling DDoS attacks
While the TLS fingerprint can still be spoofed using several packages, e.g. https://github.com/Danny-Dasilva/CycleTLS, it can still provide a meaningful/easy-to-manipulate signal.
Moreover, most bots conducting L7 DDoS don't use real/headless browsers in order to be able to scale their attack, so it's highly likely they'll have a discriminating/inconsistent TLS fingerprint.
This can also be done directly in Fastly using e.g. https://developer.fastly.com/reference/vcl/variables/client-...
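As a rough illustration of what a TLS fingerprint signal can look like, the sketch below assumes the widely used JA3 scheme, which the post does not name: five ClientHello fields are rendered as decimal values, joined with `-` inside a field and `,` between fields, and the result is MD5-hashed. The `ClientHello` struct, the `looksInconsistent` heuristic, and the sample values are hypothetical, and GREASE filtering is omitted for brevity.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
	"strconv"
	"strings"
)

// ClientHello holds the five ClientHello fields that JA3 hashes (hypothetical type).
type ClientHello struct {
	Version      uint16
	Ciphers      []uint16
	Extensions   []uint16
	Curves       []uint16
	PointFormats []uint16
}

// joinDecimal renders values as decimal numbers separated by "-".
func joinDecimal(vals []uint16) string {
	parts := make([]string, len(vals))
	for i, v := range vals {
		parts[i] = strconv.Itoa(int(v))
	}
	return strings.Join(parts, "-")
}

// ja3String builds the comma-separated JA3 input string.
func ja3String(h ClientHello) string {
	return strings.Join([]string{
		strconv.Itoa(int(h.Version)),
		joinDecimal(h.Ciphers),
		joinDecimal(h.Extensions),
		joinDecimal(h.Curves),
		joinDecimal(h.PointFormats),
	}, ",")
}

// ja3Hash returns the hex-encoded MD5 digest of the JA3 string.
func ja3Hash(h ClientHello) string {
	sum := md5.Sum([]byte(ja3String(h)))
	return hex.EncodeToString(sum[:])
}

// looksInconsistent flags a client whose User-Agent claims to be a browser
// while its fingerprint is not in a caller-supplied set of browser hashes.
func looksInconsistent(userAgent, hash string, knownBrowserHashes map[string]bool) bool {
	claimsBrowser := strings.Contains(userAgent, "Mozilla/")
	return claimsBrowser && !knownBrowserHashes[hash]
}

func main() {
	// Sample values chosen for illustration only.
	hello := ClientHello{
		Version:      771,
		Ciphers:      []uint16{4865, 4866},
		Extensions:   []uint16{0, 11},
		Curves:       []uint16{29, 23},
		PointFormats: []uint16{0},
	}
	hash := ja3Hash(hello)
	fmt.Println("ja3 string:", ja3String(hello))
	fmt.Println("ja3 hash:  ", hash)
	fmt.Println("inconsistent:", looksInconsistent("Mozilla/5.0", hash, map[string]bool{}))
}
```

A mismatch between the claimed User-Agent and the handshake is only one possible signal; a real deployment would maintain the set of known browser hashes from observed traffic rather than hard-coding it.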
Another approach to proactively flag malicious IPs is to scrape free proxy lists. Indeed, most DDoS attacks leverage a lot of cheap/known-bad IPs, and it's common to see these free proxies in such attacks.
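The proxy-list idea could be sketched as follows, assuming the scraped list arrives as plain text with one `ip:port` (or bare `ip`) entry per line. The function names and sample addresses are hypothetical, and the addresses come from the documentation ranges rather than any real proxy list.

```go
package main

import (
	"bufio"
	"fmt"
	"net/netip"
	"strings"
)

// parseProxyList reads "ip:port" or bare "ip" lines and returns the set of
// addresses found. Blank lines, "#" comments, and unparsable lines are skipped.
func parseProxyList(list string) map[netip.Addr]bool {
	set := make(map[netip.Addr]bool)
	sc := bufio.NewScanner(strings.NewReader(list))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		if ap, err := netip.ParseAddrPort(line); err == nil {
			set[ap.Addr().Unmap()] = true
			continue
		}
		if a, err := netip.ParseAddr(line); err == nil {
			set[a.Unmap()] = true
		}
	}
	return set
}

// isKnownProxy reports whether clientIP appears in the scraped set.
func isKnownProxy(set map[netip.Addr]bool, clientIP string) bool {
	a, err := netip.ParseAddr(clientIP)
	if err != nil {
		return false
	}
	return set[a.Unmap()]
}

func main() {
	// Sample list for illustration only (documentation address ranges).
	scraped := `
# free proxies
203.0.113.7:8080
198.51.100.23:3128
[2001:db8::1]:1080
`
	set := parseProxyList(scraped)
	fmt.Println(isKnownProxy(set, "203.0.113.7")) // true
	fmt.Println(isKnownProxy(set, "192.0.2.10"))  // false
}
```

Free proxy lists go stale quickly, so a set like this is probably better treated as one scoring input that is refreshed often than as a hard block list.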
What are some alternatives?
utls - Fork of the Go standard TLS library, providing low-level access to the ClientHello for mimicry purposes.
gost - GO Simple Tunnel - a simple tunnel written in golang
chromedp - A faster, simpler way to drive browsers supporting the Chrome DevTools Protocol.
mimic - Mimic chromium's HTTP/HTTP2 and TLS implementations.
surf - Stateful programmatic web browsing in Go.
phantomgo - a headless browser phantomjs for golang
Ponzu - Headless CMS with automatic JSON API. Featuring auto-HTTPS from Let's Encrypt, HTTP/2 Server Push, and flexible server framework written in Go.
colly - Elegant Scraper and Crawler Framework for Golang