Metric | wi-page | curl-impersonate
---|---|---
Mentions | 2 | 31
Stars | 1 | 3,337
Growth | - | -
Activity | 0.0 | 7.1
Last commit | about 3 years ago | about 2 months ago
Language | Python | Python
License | - | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
wi-page
- Ask HN: What are the best tools for web scraping in 2022?
[4] https://github.com/altilunium/wi-page (Scrape Wikipedia to find the most active contributors to a given article)
- Show HN: Wi-Page – Rank Wikipedia Article's Contributors by Byte Counts
curl-impersonate
- Recent 'MFA Bombing' Attacks Targeting Apple Users
> us[e] Akamai to block scraping
Would https://github.com/lwthiker/curl-impersonate help? Haven’t tried with Akamai, but did help with another widely used CDN that shall remain unnamed (but has successfully infused me with burning hate for their products after a couple of years’ worth of using an always-on VPN to bypass Internet censorship and/or a slightly unusual browser).
- Curl-impersonate: Mimic real browsers' TLS handshake with curl
- Get RSS feed for your Ko-Fi account
But before that, I had to create a development environment where I could do the coding. I used Docker and created a docker-compose.yml file on my local system to build a container. At first, I did that on an Arm-based computer, and the first problem appeared. Although RSS-Bridge was working fine, I couldn't get any data, and the reason was that Ko-Fi.com uses the Cloudflare CDN. This is something that a lot of people have had issues with in the past. RSS-Bridge solves that problem by using a special build of curl that can impersonate the four major browsers: Chrome, Edge, Safari & Firefox. But unfortunately, that library doesn't work well on Arm-based systems, so I had to move to my trusty Intel-based Linux computer.
- curl-impersonate VS curl-impersonate-php - a user suggested alternative
2 projects | 2 Aug 2023
- Found a way to bypass Cloudflare 403 forbidden in cURL, fetch
Curl-Impersonate: https://github.com/lwthiker/curl-impersonate A special build of curl that can impersonate Chrome & Firefox
- Weird API behavior: Only Postman and browser consistently work but making same request with requests library gets a Captcha instead.
- Web fingerprinting is worse than I thought
I haven’t seen a custom build of Wget, but for Curl there is curl-impersonate[1].
[1] https://github.com/lwthiker/curl-impersonate
- Using selenium with proxy still hit bot detection
- Devirtualizing Nike.com's Bot Protection (Part 1)
- Bypassing University Internet Restrictions for Legal Purposes (to access my homeservers/raspberry Pis/VPS)
If you're just trying to pull a file, curl-impersonate could be a low-effort option.
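Several of the mentions above describe curl-impersonate as a special build of curl that is run in place of the regular binary. A minimal Python sketch of that usage follows, assuming the project's wrapper scripts are installed and on `PATH`. The wrapper name `curl_chrome116` is an assumption about which version-specific script is available on a given install, and `build_command` and `fetch` are hypothetical helper names introduced only for illustration.

```python
import shutil
import subprocess


def build_command(url, wrapper="curl_chrome116"):
    """Return the argv list for fetching `url` through a curl-impersonate wrapper.

    `wrapper` is an assumed script name; use whichever wrapper your install provides.
    """
    # -s silences the progress meter and -L follows redirects (standard curl flags).
    return [wrapper, "-s", "-L", url]


def fetch(url, wrapper="curl_chrome116", timeout=30):
    """Return the response body as text, or None if the wrapper is not installed."""
    if shutil.which(wrapper) is None:
        # The wrapper script is not on PATH, so there is nothing to run.
        return None
    result = subprocess.run(
        build_command(url, wrapper),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError on a non-zero curl exit code
    )
    return result.stdout
```

Calling `fetch("https://www.wikipedia.org")` would return the page body when the wrapper is available. Whether a particular CDN accepts the request is not something this sketch can guarantee.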
What are some alternatives?
kiwix-hotspot - Kiwix Hotspot Image Creator (Desktop) for Windows/macOS/Linux
curl_cffi - Python binding for curl-impersonate via cffi. An HTTP client that can impersonate browser TLS/JA3/HTTP2 fingerprints.
estela - estela, an elastic web scraping cluster 🕸
challenge-bypass-extension - DEPRECATED - Client for Privacy Pass protocol providing unlinkable cryptographic tokens
danker - Compute PageRank on >3 billion Wikipedia links on off-the-shelf hardware.
puppeteer - Node.js API for Chrome
scrapy-redis - Redis-based components for Scrapy.
SendWhatsppTextByJavaScript - A small JS script for sending a message in a loop.
polite - Be nice on the web
static-curl - fully static builds of curl, runs anywhere
chrome-aws-lambda - Chromium Binary for AWS Lambda and Google Cloud Functions
browsercookie