Top 23 Python Scraper Projects
-
newspaper
newspaper3k is a library for news, full-text, and article metadata extraction in Python 3.
-
Douyin_TikTok_Download_API
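🚀 "Douyin_TikTok_Download_API" is an out-of-the-box, high-performance asynchronous data scraping tool for Douyin, Kuaishou, TikTok, and Bilibili, supporting API calls as well as online batch parsing and downloading.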
-
Automatic-Udemy-Course-Enroller-GET-PAID-UDEMY-COURSES-for-FREE
Do you want to LEARN NEW STUFF for FREE? Don't worry, with the power of web-scraping and automation, this script will find the necessary Udemy coupons & enroll you for PAID UDEMY COURSES, ABSOLUTELY FREE!
-
animdl
A highly efficient, fast, powerful and light-weight anime downloader and streamer for your favorite anime.
-
cinemagoer
Cinemagoer is a Python package useful to retrieve and manage the data of the IMDb (to which we are not affiliated in any way) movie database about movies, people, characters and companies
-
Scweet
A simple and unlimited twitter scraper : scrape tweets, likes, retweets, following, followers, user info, images...
-
GramAddict bot
Completely free and open-source human-like Instagram bot. Powered by UIAutomator2 and compatible with basically any Android device 5.0+ that can run Instagram - real or emulated. (by GramAddict)
-
TikTokLive
Python library to receive live stream events (comments, gifts, etc.) in realtime from TikTok LIVE.
-
google-maps-scraper
👋 HOLA 👋 HOLA 👋 HOLA ! ENJOY OUR GOOGLE MAPS SCRAPER 🚀 TO EFFORTLESSLY EXTRACT DATA SUCH AS NAMES, ADDRESSES, PHONE NUMBERS, REVIEWS, WEBSITES, AND RATINGS FROM GOOGLE MAPS WITH EASE! 🤖
-
google-play-scraper
Google play scraper for Python inspired by <facundoolano/google-play-scraper> (by JoMingyu)
I am currently working on a web scraper for TikTok. At the moment, I can retrieve data about the first 16 videos from a channel by making requests to an unofficial API, https://github.com/Evil0ctal/Douyin_TikTok_Download_API. My problem is that the requirements for this project do not allow me to use any package that would extract data from TikTok. I would like to ask you all how I should go about this task. I already tried getting the data from the HTML, but that is not sufficient, since most of it is missing from the response when I use requests.get(URL). Could you please recommend some repositories that could help, or some way of extracting the data? Thank you!
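One common reason `requests.get(URL)` returns less than the browser shows is that the page is rendered client-side from data embedded in a `<script>` tag or fetched later by JavaScript. The sketch below uses only the standard library to pull JSON out of `<script type="application/json">` tags. Whether TikTok embeds its data this way is an assumption, and the sample HTML, the `page-data` id, and the `videos` key are all made up for illustration.

```python
import json
from html.parser import HTMLParser


class EmbeddedJSONParser(HTMLParser):
    """Collect the parsed contents of <script type="application/json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_json_script = False
        self._chunks = []
        self.payloads = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/json":
            self._in_json_script = True
            self._chunks = []

    def handle_data(self, data):
        if self._in_json_script:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_script:
            self._in_json_script = False
            try:
                self.payloads.append(json.loads("".join(self._chunks)))
            except json.JSONDecodeError:
                pass  # skip script bodies that are not valid JSON


def extract_embedded_json(html):
    """Return a list of JSON payloads embedded in the given HTML string."""
    parser = EmbeddedJSONParser()
    parser.feed(html)
    parser.close()
    return parser.payloads


# Hypothetical page: the id and keys are invented for this example.
SAMPLE_HTML = """
<html><body>
<div id="app"></div>
<script id="page-data" type="application/json">{"videos": [{"id": "1", "title": "demo"}]}</script>
</body></html>
"""

payloads = extract_embedded_json(SAMPLE_HTML)
print(payloads[0]["videos"][0]["title"])
```

In practice the string passed to `extract_embedded_json` would be the body of the HTTP response. If no such tag exists in that response, the data is probably loaded by a later request, which this sketch does not cover.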
Here's what I'm trying to use: https://github.com/JustAnotherArchivist/snscrape What do I need to open/run any of this? My goal with this is to extract my follower list off Twitter, and I'd very much like to know how to run it on my machine instead of having someone run it for me on theirs. I can't even figure out what I need to open the Readme file.
Project mention: Does anyone have anime websites recommendations without ads? | /r/anime | 2023-07-10
If you're tech-savvy, you can try animdl, which is a command line tool. No browser, no ads. To directly stream with the default provider AllAnime: animdl stream 'Your Anime Name'
Is there a functioning tool to download the saved posts / upvotes that you do on reddit? This tool: https://github.com/shadowmoose/RedditDownloader was perfect, but it got rekt by the API changes and has been discontinued.
Project mention: Twitter api reaching rate limit. 5calls per 15 mins just to get user likes. | /r/learnprogramming | 2023-05-22
Hmm, do you know any good one? I found this one, but it doesn't scrape a single tweet's likes and followers: https://github.com/Altimis/Scweet
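A limit like the one in the post title (5 calls per 15 minutes) can be respected on the client side by tracking recent call times. This is a minimal sliding-window sketch, not part of any project listed here. The class name and its parameters are invented for illustration, and the real API may count its window differently.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_calls within any window of `period` seconds."""

    def __init__(self, max_calls=5, period=15 * 60, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock          # injectable so the limiter can be tested
        self._calls = deque()       # timestamps of recent calls, oldest first

    def wait_time(self):
        """Return how many seconds to wait before the next call is allowed."""
        now = self.clock()
        # Forget calls that have fallen out of the window.
        while self._calls and now - self._calls[0] >= self.period:
            self._calls.popleft()
        if len(self._calls) < self.max_calls:
            return 0.0
        return self.period - (now - self._calls[0])

    def record(self):
        """Register that a call was just made."""
        self._calls.append(self.clock())
```

A caller would typically run `time.sleep(limiter.wait_time())`, then `limiter.record()`, then make the request.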
Project mention: Show HN: New AI Dataset Based on LibGen and Sci-Hub | news.ycombinator.com | 2023-09-08
If they don't want you to use their API just respect their wishes and scrape Reddit. https://github.com/JosephLai241/URS it's the only moral thing we can do.
Project mention: I create a google maps scraper, let me know your thoughts | /r/webscraping | 2023-07-06
My scraper runs at 120 listings per 10 minutes, so yours is quite fast. You can see my scraper at https://github.com/omkarcloud/google-maps-scraper. It is quite popular, with 95 stars.
Python Scraper related posts
- Can someone walk me through this?
- What’s the coolest things you’ve done with python?
- BDFR skipping Reddit hosted videos
- Updated Drexel Scheduler to Winter Quarter
- Show HN: New AI Dataset Based on LibGen and Sci-Hub
- Exporting a telegram chat without Telegram Desktop?
- cryptoCMD: NEW Data - star count:456.0
Index
What are some of the best open-source Scraper projects in Python? This list will help you:
# | Project | Stars |
---|---|---|
1 | newspaper | 13,720 |
2 | chinese-xinhua | 10,641 |
3 | Douyin_TikTok_Download_API | 6,780 |
4 | autoscraper | 5,937 |
5 | myGPTReader | 4,375 |
6 | snscrape | 4,224 |
7 | Automatic-Udemy-Course-Enroller-GET-PAID-UDEMY-COURSES-for-FREE | 3,053 |
8 | bulk-downloader-for-reddit | 2,203 |
9 | JobFunnel | 1,740 |
10 | linkedin_scraper | 1,689 |
11 | mlscraper | 1,225 |
12 | animdl | 1,201 |
13 | cinemagoer | 1,190 |
14 | RedditDownloader | 1,100 |
15 | finviz | 1,008 |
16 | Scweet | 966 |
17 | GramAddict bot | 886 |
18 | scrapyrt | 816 |
19 | bookcorpus | 778 |
20 | URS | 724 |
21 | TikTokLive | 707 |
22 | google-maps-scraper | 714 |
23 | google-play-scraper | 688 |