Python web-scraper

Open-source Python projects categorized as web-scraper

Top 20 Python web-scraper Projects

  • lightnovel-crawler

    Generate and download e-books from online sources.

  • Project mention: Help with Paperback IOS. | /r/mangapiracy | 2023-06-18

    Use lightnovel-crawler in a terminal on a computer, or through its Discord bot, to find series across multiple LN / webnovel sites, then choose the format to download (EPUB, PDF, TXT, and many more).

  • Monkey-DL (Anime Downloader)

    Bulk download your favourite anime episodes from your favourite anime websites

  • web-scraping

    Detailed web scraping tutorials for dummies with financial data crawlers on Reddit WallStreetBets, CME (both options and futures), US Treasury, CFTC, LME, MacroTrends, SHFE and alternative data crawlers on Tomtom, BBC, Wall Street Journal, Al Jazeera, Reuters, Financial Times, Bloomberg, CNN, Fortune, The Economist

  • Project mention: web-scraping: NEW Data - star count:554.0 | /r/algoprojects | 2023-09-25
  • basketball_reference_web_scraper

    NBA Stats API via Basketball Reference

  • summarizer

    A Reddit bot that summarizes news articles written in Spanish or English. It uses a custom-built algorithm to rank words and sentences.

  • facebook_page_scraper

    Scrapes the front end of Facebook pages with no limitations and provides a feature to turn the data into structured JSON or CSV.

  • Senpwai

    A desktop app for tracking and batch downloading anime

  • Project mention: Building W-9 Crafter | dev.to | 2024-03-28

    It's been a cool learning experience making a Product Hunt listing, a small demo video, and allll the social posts (Twitter, LinkedIn, etc).

  • CobWeb-lnx

    CobWeb is a Python library for web scraping. The library consists of two classes: Spider and Scraper.

  • Project mention: Who has contributed to and who has used open-source projects? | /r/devpt | 2023-06-30
  • reddit-bots

    A collection of Reddit bots that I use to enhance the subreddits I manage.

  • tagalog-dictionary-scraper

    Builds a Tagalog dictionary by collecting Tagalog words from tagalog.pinoydictionary.com

  • mexican-jobs-2020

    Data ETL & Analysis on thousands of job listings from the official Mexican job board (2020 edition).

  • opensubtitles-scraper

    Scrape subtitles from opensubtitles.org

  • tweet-transcriber

    A Reddit bot that transcribes tweets from comments and submissions links, mirrors their images and replies back with a formatted Markdown message.

  • git-pull

    Parallelized web scraper for GitHub

  • Python-Web-Scraper

    An adaptive Python Web Scraper App to catch the best deals by scraping and parsing data from select E-Commerce sites.

  • Abosar

    অবসর 📚 A collection of short Bengali stories web scraped from various Bengali eMagazines and eNewspapers.

  • gli99

    Web scraper for gifcities.org

  • varieteebot

    A Telegram bot that sends today's tee of some tee shops.

  • iw-scraper

    Web scraper for imovelweb listings

  • nanoscrape

    Simple scraping program that can download webpages, Discord embeds, and more.

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python web-scraper related posts

  • Help with Paperback IOS.

    1 project | /r/mangapiracy | 18 Jun 2023
  • Multiparadigmatic Web Scraping Tool!

    1 project | /r/computerscience | 14 May 2023
  • ISSTH left me disappointed

    1 project | /r/noveltranslations | 1 May 2023
  • a discord server and bot to fetch epub chapters from novels?

    1 project | /r/MartialMemes | 16 Apr 2023
  • Python Web Scraper/Crawler for E-Commerce sites. Currently supports only a few websites but I'm looking to expand that list. Tips/criticism are welcomed. This is the first project for my student CV (0 working experience) so I'd like it to be as polished as possible.

    1 project | /r/programming | 1 Mar 2023
  • What is your experience with e-readers?

    2 projects | /r/thenetherlands | 10 Jul 2022
  • Does the kindle have a search function? A working one? I’ve seen videos but those are like years old.

    1 project | /r/kindle | 21 May 2022

Index

What are some of the best open-source web-scraper projects in Python? This list will help you:

 #  Project                            Stars
 1  lightnovel-crawler                 1,302
 2  Monkey-DL (Anime Downloader)         810
 3  web-scraping                         678
 4  basketball_reference_web_scraper     407
 5  summarizer                           267
 6  facebook_page_scraper                199
 7  Senpwai                              132
 8  CobWeb-lnx                            38
 9  reddit-bots                           23
10  tagalog-dictionary-scraper            23
11  mexican-jobs-2020                     21
12  opensubtitles-scraper                 20
13  tweet-transcriber                     19
14  git-pull                              16
15  Python-Web-Scraper                    12
16  Abosar                                12
17  gli99                                  3
18  varieteebot                            3
19  iw-scraper                             0
20  nanoscrape                             0
