webscraper

Open-source projects categorized as webscraper

Top 23 webscraper Open-Source Projects

  • soup

    Web Scraper in Go, similar to BeautifulSoup

  • xidel

    Command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern matching. It can also create new or transformed XML/HTML/JSON documents.

  • Project mention: Move over jq I found something easier: fx | news.ycombinator.com | 2023-06-06

    You could try Xidel[1]. It supports JSON, XML and HTML using XPath/XQuery 3.1

    It has some extensions to the standard that are pretty nice (JSONiq, CSS selectors, html “template” matching), but you can limit it to just standard XPath/XQuery if you like.

    I recommend getting the nightly v .99 build if you give it a try, the stable .98 version is pretty old and I’ve had no issues with .99

    1. https://www.videlibri.de/xidel.html

  • Rcrawler

    An R web crawler and scraper

  • rightmove_webscraper.py

    Python class to scrape data from rightmove.co.uk and return listings in a pandas DataFrame object

  • Stocker

    Financial Web Scraper & Sentiment Classifier (by dwallach1)

  • crypto

    Cryptocurrency Historical Market Data R Package (by JesseVent)

  • CoWin-Vaccine-Notifier

    Automated Python Script to retrieve vaccine slots availability and get notified when a slot is available.

  • iSubRip

    A Python package for scraping and downloading subtitles from AppleTV / iTunes movie pages.

  • Jobs_LinkedIn

    Finds Jobs on LinkedIn using web-scraping

  • SearchifyX

    Fast flashcard searcher study tool

  • scraperx

    Library for scraping websites or apis at any scale

  • letterboxdpy

    A letterboxd webscraper

  • NoFbEventScraper

    This app scrapes Facebook event links and adds the event to your calendar.

  • otakuapuri

    Otakuapuri is a manga downloader and anime streaming application that provides an easy and convenient platform for manga and anime enthusiasts. Users can download their favorite manga in PDF format and stream their favorite anime series.

  • raspberry-pi-stock-checker

    A configurable python webscraper that checks raspberry pi stocks from verified sellers

  • YellowPage-scraper

    A YellowPage scraper is a Python program/script that extracts data from the YellowPages.com website using the Python programming language. The scraper can be used to gather information such as business names, addresses, phone numbers, emails and reviews from the YellowPages website.

  • Project mention: Private business directory website | /r/selfhosted | 2023-12-03

    Hi. I am looking to host a private business directory for a community of entrepreneurs, similar to www.yellowpages.com. Private as in protected by a PIN or something. Got any suggestions?

  • hes-dead-jim

    A command-line tool for finding and reporting dead/broken links in a file or webpage.

  • kicktipp-bot

    A bot which can submit tips for a Kicktipp competition based on quotes.

  • ti_scraper

    Highly configurable scripts for a web scraper intended to be used for cyber threat intelligence

  • Project mention: Adding Proxy to existing Scraper | /r/webscraping | 2023-11-04

    because I'm not a developer, I took this project https://github.com/sandra-liedtke/ti_scraper to help me.

  • File-Engine

  • Project mention: A web scraper to extract files from google | /r/coolgithubprojects | 2023-08-04
  • manga2pdf

    Simple Ruby script to download manga and merge the images into a single pdf file. Available with both CLI and GUI.

  • PotParser

    Python package which allows you to scrape information about cannabis strains and calculate the amount of THC or CBD in a given amount of flower

  • Project mention: PotParser - a cli tool for getting information's about a strain from different websites | /r/trees | 2023-05-12
  • tailwind-starter

    This is my Gulp starter template for Tailwind that implements RTL support, JIT mode, tree-shaking, Dart Sass mixins and functions, ES6 helper functions, and more out of the box.

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

webscraper related posts

  • Private business directory website

    1 project | /r/selfhosted | 3 Dec 2023
  • How do you get girl clothes in secret

    1 project | /r/MtF | 12 Jul 2023
  • I have made a simple webscraper in python.pls checkout this github project.

    1 project | /r/madeinpython | 23 May 2023
  • Writing a simple Web crawler in python

    1 project | /r/u_DevGenious | 9 Apr 2023
  • Wrote an article on medium above webscraping in python

    1 project | /r/Python | 9 Apr 2023
  • Wrote a Simple webcrawler in python

    1 project | /r/PythonProjects2 | 9 Apr 2023
  • FULL GUIDE FOR EDGENUITY

    3 projects | /r/edgenuity | 2 Feb 2023

Index

What are some of the best open-source webscraper projects? This list will help you:

#   Project                     Stars
1   soup                        2,128
2   xidel                       652
3   Rcrawler                    344
4   rightmove_webscraper.py     236
5   Stocker                     149
6   crypto                      141
7   CoWin-Vaccine-Notifier      107
8   iSubRip                     92
9   Jobs_LinkedIn               62
10  SearchifyX                  58
11  scraperx                    53
12  letterboxdpy                31
13  NoFbEventScraper            26
14  otakuapuri                  18
15  raspberry-pi-stock-checker  13
16  YellowPage-scraper          7
17  hes-dead-jim                5
18  kicktipp-bot                5
19  ti_scraper                  5
20  File-Engine                 4
21  manga2pdf                   4
22  PotParser                   4
23  tailwind-starter            3
