requests-html vs croncert-config

requests-html

Pythonic HTML Parsing for Humans™ (by kennethreitz)

croncert-config

Configuration and GitHub Actions for concertcloud.live (formerly known as croncert.ch), a website that shows you concerts in various cities (by jakopako)

concerts Crowdsourcing Scraping scraping-websites

Project         requests-html        croncert-config
Mentions        2                    3
Stars           266                  10
Growth          -                    -
Activity        0.0                  9.3
Latest Commit   almost 2 years ago   5 days ago
Language        (not listed)         (not listed)
License         MIT License          -

The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
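
The exact formula behind the activity number is not given here, so the following is only a minimal sketch of one way such a score could be computed. It assumes a hypothetical exponential decay (90-day half-life) to weight recent commits more heavily, and it maps each project's percentile rank onto a 0 to 10 scale, so that 9.0 corresponds to being more active than 90% of tracked projects. All function names and the half-life are assumptions for illustration.

```python
from datetime import date, timedelta


def commit_weight(commit_day, today, half_life_days=90.0):
    """Weight of one commit; halves every `half_life_days` (assumed value)."""
    age_days = (today - commit_day).days
    return 0.5 ** (age_days / half_life_days)


def raw_activity(commit_days, today):
    """Sum of decayed commit weights, so recent commits count more."""
    return sum(commit_weight(day, today) for day in commit_days)


def activity_scores(projects, today):
    """Map each project to a 0-10 score based on its percentile rank.

    `projects` maps a project name to a list of commit dates.
    A score of 9.0 means the project outranks 90% of all projects.
    """
    raw = {name: raw_activity(days, today) for name, days in projects.items()}
    total = len(raw)
    scores = {}
    for name, value in raw.items():
        # Count the projects that are strictly less active than this one.
        below = sum(1 for other in raw.values() if other < value)
        scores[name] = round(10.0 * below / total, 1)
    return scores


if __name__ == "__main__":
    today = date(2023, 3, 9)
    # Ten made-up projects: project "p<i>" has i commits made today.
    demo = {f"p{i}": [today] * i for i in range(10)}
    print(activity_scores(demo, today))
```

In this sketch a project with no recent commits drifts toward a raw value of zero, which is consistent with a stale repository showing an activity of 0.0, but the real weighting used by the site may differ.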

requests-html

Posts with mentions or reviews of requests-html. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-05-23.

Which string to lower case method do you use?
2 projects | /r/Python | 23 May 2022

Example: requests-html has a rather exhaustive README.md, but its dedicated documentation page is not that helpful, if I remember correctly, and the domain is currently suspended.

Problem reaching a link hidden deeply in the html
1 project | /r/webscraping | 14 Jun 2021

You can get through this by using requests_html to render the full page before trying to reach this url (Selenium works too but is even heavier).
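
The suggestion in this post can be sketched roughly as follows. This is a minimal example, not the poster's actual code: it assumes the third-party `requests_html` package is installed, that the target link only appears after JavaScript has run, and that the link can be recognized by a substring. The helper names `filter_links` and `find_rendered_links` are hypothetical.

```python
def filter_links(links, needle):
    """Return, in sorted order, the links that contain the substring `needle`."""
    return sorted(link for link in links if needle in link)


def find_rendered_links(url, needle):
    """Render `url` with requests_html and return the matching absolute links."""
    # Imported lazily because requests_html is a third-party dependency.
    from requests_html import HTMLSession

    session = HTMLSession()
    response = session.get(url)
    # Executes the page's JavaScript in headless Chromium.
    # The first call downloads Chromium, which can take a while.
    response.html.render()
    # absolute_links is the set of all links on the rendered page.
    return filter_links(response.html.absolute_links, needle)
```

A call such as `find_rendered_links("https://example.com", "/download/")` (placeholder URL and substring) would then return only the rendered links containing that path fragment.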

croncert-config

Posts with mentions or reviews of croncert-config. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-03-09.

No code command line webscraper
3 projects | /r/webscraping | 9 Mar 2023

I actually started this scraping project because of an idea I wanted to try: scraping concert data from as many websites as possible with as little effort as possible (see https://github.com/jakopako/croncert-config). This seems to work better and better. Still, I am wondering whether there are other valid use cases for such a terminal-based scraper or whether it is rather niche. What do you think?

Crowdsourced concert scraping project
2 projects | /r/webscraping | 17 May 2022

I am currently working on a configurable command-line web scraper called goskyr. My first use case is collecting as much concert data as possible for a website idea I had, croncert.ch. I am hoping that people other than me are willing to contribute to the scraper configuration file in this repository, https://github.com/jakopako/croncert-config, which also contains a GitHub Action to regularly run the scraper. What do you think? Could this work? How should I spread the word?

New concert website
1 project | /r/Music | 16 Apr 2022

croncert.ch is a website that lists concerts worldwide (currently, ‘worldwide’ is more of a euphemism), focussing on smaller venues. An automated process regularly scrapes the underlying concert data. The idea is that anyone can contribute by extending the scraper configuration with new concert venues. Feel free to check out https://github.com/jakopako/croncert-config for more details!

What are some alternatives?

When comparing requests-html and croncert-config you can also consider the following projects:

requests-html - Pythonic HTML Parsing for Humans™

goskyr - A configurable command-line web scraper written in Go with auto-configuration capability

html2rss - 📰 Build RSS 2.0 feeds from websites (and JSON APIs) with a few CSS selectors.

fitter - A new way to collect information from APIs and websites

open-dictionary - 🦄 An initiative to create a dictionary which is free for everyone 🚀

Ferret - Declarative web scraping

osdg-data - The OSDG Community Dataset (OSDG-CD) is a public dataset of thousands of text excerpts, validated by OSDG Community Platform (OSDG-CP) citizen scientists with respect to the Sustainable Development Goals (SDGs). The dataset is updated every quarter and published on Zenodo.

Crawly - Crawly, a high-level web crawling & scraping framework for Elixir.