Top 4 webarchiving Open-Source Projects
-
wget-lua
Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.
Project mention: Show HN: OpenAPI DevTools – Chrome ext. that generates an API spec as you browse | news.ycombinator.com | 2023-10-25
https://github.com/iipc/awesome-web-archiving/blob/main/READ...
I ended up using the waybackpy Python module to retrieve archived URLs, and it worked well. I think the feature you want for this is "snapshots", but I didn't test this myself.
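As a rough illustration of what retrieving an archived URL involves, here is a minimal sketch that uses only the Python standard library and the Wayback Machine's public availability endpoint rather than waybackpy itself. The helper names (`build_availability_url`, `closest_snapshot`) and the sample response are hypothetical and for illustration only; the code makes no network call and does not show waybackpy's "snapshots" feature.

```python
import json
from urllib.parse import urlencode

# Public Wayback Machine availability endpoint (not part of waybackpy).
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_availability_url(page_url, timestamp=None):
    """Build the query URL asking for the snapshot closest to `timestamp`.

    `timestamp` is an optional string in YYYYMMDD[hhmmss] form.
    """
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(response_text):
    """Return the archived URL from an availability response, or None."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


# Hypothetical response body, shaped like the endpoint's JSON output.
SAMPLE = (
    '{"archived_snapshots": {"closest": {"available": true, '
    '"url": "http://web.archive.org/web/20130919044612/http://example.com/", '
    '"timestamp": "20130919044612", "status": "200"}}}'
)

print(build_availability_url("example.com", "20060101"))
print(closest_snapshot(SAMPLE))
```

To query the live service, the built URL could be fetched with `urllib.request.urlopen` and the response body passed to `closest_snapshot`.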
NOTE:
The open-source projects on this list are ordered by number of GitHub stars.
The number of mentions indicates repo mentions in the last 12 months or
since we started tracking (Dec 2020).
webarchiving related posts
-
DPReview.com is going down effective April 10.
-
DPReview.com to close on April 10 after 25 years of operation
-
This Layoff Does Not Exist: tech layoff announcements but weird
-
Alternative to HTTrack (website copier) as of 2023?
-
Software to keep Website pages "alive"?
-
How to Download All of Wikipedia onto a USB Flash Drive
-
[HELP] Starting Out for a Beginner
Index
What are some of the best open-source webarchiving projects? This list will help you:
# | Project | Stars |
---|---|---|
1 | awesome-web-archiving | 1,811 |
2 | waybackpy | 405 |
3 | wget-lua | 81 |
4 | cc-notebooks | 37 |