bbcrss
mastodon-scraping

| bbcrss | mastodon-scraping |
---|---|---|
Mentions | 1 | 1 |
Stars | 5 | 3 |
Growth | - | - |
Activity | 10.0 | 0.0 |
Last commit | about 1 year ago | 3 days ago |
Language | XSLT | - |
License | - | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
bbcrss
-
Git scraping: track changes over time by scraping to a Git repository
I've been promoting this idea for a few years now, and I've seen an increasing number of people put it into action.
A fun way to track how people are using this is with the git-scraping topic on GitHub:
https://github.com/topics/git-scraping?o=desc&s=updated
That page orders repos tagged git-scraping by most-recently-updated, which shows which scrapers have run most recently.
As I write this, repos updated within the last minute include:
https://github.com/drzax/queensland-traffic-conditions
https://github.com/jasoncartwright/bbcrss
https://github.com/jackharrhy/metrobus-timetrack-history
https://github.com/outages/bchydro-outages
mastodon-scraping
-
Git scraping: track changes over time by scraping to a Git repository
Thanks for linking to the topic, that was interesting
As a heads up to anyone trying this stunt, please be mindful that git-diff is ultimately a line-oriented operation (yeah, yeah, "git stores snapshots").
For example https://github.com/pmc-ss/mastodon-scraping/commit/2a15ce1b2... is all :fu: because git sees basically the "first line" changed
However, had the author normalized instances.json with something like "jq -S", one would end up with a more reasonable 1736 textual changes, which GitHub would almost certainly have rendered.
diff -u \
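The normalization idea in the comment above can be sketched in Python. This is a minimal illustration, not the scraper's actual code: `normalize_json` and `changed_lines` are hypothetical helpers, the sample data is made up, and `json.dumps(sort_keys=True, indent=2)` is used as a rough stand-in for what `jq -S` does (sorted keys, one value per line).

```python
import difflib
import json


def normalize_json(text: str) -> str:
    """Re-serialize JSON with sorted keys and one value per line."""
    data = json.loads(text)
    return json.dumps(data, sort_keys=True, indent=2, ensure_ascii=False) + "\n"


def changed_lines(old: str, new: str) -> list[str]:
    """Return the added and removed lines of a unified diff, without context."""
    diff = difflib.unified_diff(old.splitlines(), new.splitlines(), lineterm="")
    return [
        line
        for line in diff
        # Keep +/- lines but skip the "+++" and "---" file headers.
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


# Hypothetical single-line scrape results: key order differs between runs,
# and only the user count for b.example actually changed.
old_raw = '{"b.example": {"users": 10}, "a.example": {"users": 5}}'
new_raw = '{"a.example": {"users": 5}, "b.example": {"users": 11}}'

# Without normalization the whole document is one changed line.
print(changed_lines(old_raw, new_raw))

# With normalization only the line that really changed shows up.
print(changed_lines(normalize_json(old_raw), normalize_json(new_raw)))
```

Running the normalizer on each scrape before committing keeps the Git history readable, because key reordering no longer registers as a change and each real change lands on its own line.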
What are some alternatives?
gesetze-im-internet - Archive of German legal acts (weekly archive of gesetze-im-internet.de)
shot-scraper - A command-line utility for taking automated screenshots of websites
github-actions - Information and tips regarding GitHub Actions
hun_law_rs - Tool for parsing Hungarian laws (Rust version)
mcbroken-archive - Archive for data from mcbroken.com.
queensland-traffic-conditions - A scraper that tracks changes to the published Queensland traffic incidents data
Geo-IP-Database - Automatically updated tree-formatted database from MaxMind database
gh-action-data-scraping - Shows how to use GitHub Actions to do periodic data scraping
hun_law_py - Tools for parsing Hungarian legal documents
torvenyek - Git repo of Hungarian laws
bchydro-outages - Track BCHydro Outages via Git history
