is there a way to take "snapshots" of every page of a website?

This page summarizes the projects mentioned and recommended in the original post on /r/DataHoarder.

  • fetchurls

    A bash script to spider a site, follow links, and fetch URLs (with built-in filtering) into a generated text file. (A generic command-line crawl sketch follows after the project list.)

  • Yacy

    Distributed Peer-to-Peer Web Search Engine and Intranet Search Appliance

  • I happen to already run a YaCy node, and it's decent at crawling things. The resulting list of URLs could then be fed into ArchiveBox.

  • ArchiveBox

    🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...

  • I've been playing around with ArchiveBox. It offers several different options for how each snapshot is stored (see the ingestion sketch after this list).
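
For anyone who wants to reproduce the crawl step from the command line, below is a minimal sketch of spidering a site and collecting its URLs into a text file. It uses plain wget --spider as a generic stand-in for what fetchurls automates; the example domain, file names, and URL-matching pattern are illustrative assumptions, not details from the original post.

    # Sketch: spider a site and collect every discovered URL into urls.txt.
    # Assumes GNU wget and grep; https://example.com/ is a placeholder domain.

    # --spider follows links recursively without keeping the downloaded pages.
    wget --spider --recursive --no-verbose --output-file=crawl.log https://example.com/

    # Pull anything that looks like an http(s) URL out of the crawl log and de-duplicate it.
    grep -oE 'https?://[^ ]+' crawl.log | sort -u > urls.txt

    echo "Collected $(wc -l < urls.txt) URLs"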
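
Once a urls.txt exists (whether from fetchurls, a YaCy crawl, or the wget sketch above), feeding it into ArchiveBox looks roughly like the following. This is a sketch based on ArchiveBox's documented command-line workflow; the collection directory and port are arbitrary assumptions.

    # Sketch: archive every URL in urls.txt with ArchiveBox.
    # Assumes Python/pip; ArchiveBox can also be run from its official Docker image.
    pip install archivebox

    # Each ArchiveBox collection lives in its own directory.
    mkdir -p ~/website-archive && cd ~/website-archive
    archivebox init

    # ArchiveBox reads URLs from stdin, one per line, and saves a snapshot of each
    # (HTML, PDF, screenshot, media, etc., depending on configuration).
    archivebox add < ~/urls.txt

    # Browse the snapshots in the local web UI at http://localhost:8000.
    archivebox server 0.0.0.0:8000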

NOTE: The mention count for each project combines mentions in common posts and user-suggested alternatives, so a higher count indicates a more popular project.
