Top 23 webscraper Open-Source Projects
-
xidel
Command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern matching. It can also create new or transformed XML/HTML/JSON documents.
-
rightmove_webscraper.py
Python class to scrape data from rightmove.co.uk and return listings in a pandas DataFrame object
-
CoWin-Vaccine-Notifier
Automated Python script that checks vaccine slot availability and notifies you when a slot opens up.
-
otakuapuri
Otakuapuri is a manga downloader and anime streaming application that provides an easy and convenient platform for manga and anime enthusiasts. Users can download their favorite manga in PDF format and stream their favorite anime series.
-
raspberry-pi-stock-checker
A configurable Python web scraper that checks Raspberry Pi stock at verified sellers
-
YellowPage-scraper
A Python script that extracts data from the YellowPages.com website. It can gather information such as business names, addresses, phone numbers, emails and reviews.
-
ti_scraper
Highly configurable scripts for a web scraper intended to be used for cyber threat intelligence
-
manga2pdf
Simple Ruby script to download manga and merge the images into a single PDF file. Available with both a CLI and a GUI.
-
PotParser
Python package which allows you to scrape information about cannabis strains and calculate the amount of THC or CBD in a given amount of flower
-
tailwind-starter
A Gulp starter template for Tailwind that provides RTL support, JIT mode, tree-shaking, Dart Sass mixins and functions, ES6 helper functions, and more out of the box
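The PotParser entry above mentions calculating the amount of THC or CBD in a given amount of flower. The arithmetic can be sketched as follows. This is not PotParser's API: the function name `cannabinoid_mg` and its parameters are hypothetical, and the sketch assumes potency is given as a percentage by weight.

```python
def cannabinoid_mg(flower_grams: float, potency_percent: float) -> float:
    """Return milligrams of a cannabinoid (THC or CBD) in the given flower.

    Hypothetical helper: assumes potency is a percentage by weight.
    """
    if flower_grams < 0 or not 0 <= potency_percent <= 100:
        raise ValueError("grams must be >= 0 and potency within 0-100")
    # 1 g = 1000 mg, so convert to mg and scale by the potency fraction.
    return flower_grams * 1000 * potency_percent / 100


if __name__ == "__main__":
    # 0.5 g of flower at 20% THC
    print(cannabinoid_mg(0.5, 20))
```

For example, half a gram at 20% potency works out to 0.5 × 1000 × 0.20 milligrams.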
You could try Xidel[1]. It supports JSON, XML and HTML using XPath/XQuery 3.1.
It has some extensions to the standard that are pretty nice (JSONiq, CSS selectors, HTML "template" matching), but you can limit it to just standard XPath/XQuery if you like.
I recommend getting the nightly .99 build if you give it a try; the stable .98 version is pretty old, and I've had no issues with .99.
1. https://www.videlibri.de/xidel.html
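To illustrate the kind of XPath query the comment describes, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. It is not Xidel, and ElementTree supports only a limited XPath subset rather than XPath/XQuery 3.1. The sample document and the `titles_in_stock` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET
from typing import List

# Made-up sample document, used only for this sketch.
SAMPLE = """
<catalog>
  <item stock="yes"><title>Alpha</title></item>
  <item stock="no"><title>Beta</title></item>
  <item stock="yes"><title>Gamma</title></item>
</catalog>
"""


def titles_in_stock(xml_text: str) -> List[str]:
    """Return the titles of all items whose stock attribute is 'yes'."""
    root = ET.fromstring(xml_text)
    # Attribute predicate: keep only <item> elements with stock="yes",
    # then step down to their <title> children.
    return [t.text for t in root.findall("./item[@stock='yes']/title")]


if __name__ == "__main__":
    print(titles_in_stock(SAMPLE))
```

A dedicated tool is the better fit once queries need functions, joins, or JSON support, which this subset does not cover.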
Hi. I am looking to host a private business directory for a community of entrepreneurs, similar to www.yellowpages.com. Private as in protected by a PIN or something. Got any suggestions?
Because I'm not a developer, I took this project https://github.com/sandra-liedtke/ti_scraper to help me.
Project mention: PotParser - a CLI tool for getting information about a strain from different websites | /r/trees | 2023-05-12
Index
What are some of the best open-source webscraper projects? This list will help you:
# | Project | Stars |
---|---|---|
1 | soup | 2,128 |
2 | xidel | 652 |
3 | Rcrawler | 344 |
4 | rightmove_webscraper.py | 236 |
5 | Stocker | 149 |
6 | crypto | 141 |
7 | CoWin-Vaccine-Notifier | 107 |
8 | iSubRip | 92 |
9 | Jobs_LinkedIn | 62 |
10 | SearchifyX | 58 |
11 | scraperx | 53 |
12 | letterboxdpy | 31 |
13 | NoFbEventScraper | 26 |
14 | otakuapuri | 18 |
15 | raspberry-pi-stock-checker | 13 |
16 | YellowPage-scraper | 7 |
17 | hes-dead-jim | 5 |
18 | kicktipp-bot | 5 |
19 | ti_scraper | 5 |
20 | File-Engine | 4 |
21 | manga2pdf | 4 |
22 | PotParser | 4 |
23 | tailwind-starter | 3 |