An easy solution to save entire websites for my dad?

This page summarizes the projects mentioned and recommended in the original post on /r/DataHoarder

  • ArchiveBox

    🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...

  • Run a small server with some storage at his house (or at yours, if you make it public-facing) with ArchiveBox on it. It's basically a locally hosted Archive.org clone. https://github.com/ArchiveBox/ArchiveBox

  • replayweb.page

    Serverless replay of web archives directly in the browser

  • grab-site (github.com/Archiveteam/grab-site) is quite simple, and you could probably whip up a script around it easily. It saves to WARC rather than plain HTML, but there's a very good site, https://replayweb.page, that renders most archived pages well ... The catch is that grab-site doesn't run JavaScript, so sites that need JS to load their images will probably be saved without them.

  • archivy

    Archivy is a self-hostable knowledge repository that allows you to learn and retain information in your own personal and extensible wiki.

  • Archivy might be to your liking!

  • savepagenow

    A simple Python wrapper and command-line interface for archive.org’s "Save Page Now" capturing service

  • You can use Firefox to save web pages as HTML easily: press "F10" to show the menu bar at the top, then click "File" and "Save Page As" and choose where to save it. This doesn't crawl, but because it's very quick, you could save each link manually. Another option, which does crawl, is to save the pages in the Wayback Machine. That doesn't store the pages on your computer, but it makes them available for everyone to see.
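
The two scriptable suggestions above (grab-site for a local WARC copy, and the Wayback Machine via savepagenow) could be tied together roughly as follows. This is only a sketch under assumptions: it assumes `grab-site` is installed and on the PATH, that the `savepagenow` package is installed, and that the sites are listed one per line in a text file. The helper names (`read_urls`, `crawl_locally`, `save_to_wayback`) and the `sites.txt` file name are made up for illustration, and no extra grab-site flags are used.

```python
import subprocess


def read_urls(text):
    """Return the non-empty, non-comment lines of a URL list."""
    urls = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            urls.append(line)
    return urls


def grab_site_command(url):
    """Build the basic `grab-site URL` invocation (no extra flags assumed)."""
    return ["grab-site", url]


def crawl_locally(urls, run=subprocess.run):
    """Crawl each site into a local WARC with grab-site.

    Returns a dict mapping each URL to the exit code of its crawl.
    """
    results = {}
    for url in urls:
        completed = run(grab_site_command(url))
        results[url] = completed.returncode
    return results


def save_to_wayback(urls, capture=None):
    """Submit each URL to the Wayback Machine's "Save Page Now" service.

    Returns a dict mapping each URL to its archive URL, or None on failure.
    """
    if capture is None:
        import savepagenow  # third-party package: pip install savepagenow
        capture = savepagenow.capture
    archived = {}
    for url in urls:
        try:
            archived[url] = capture(url)
        except Exception as error:  # keep going if one capture fails
            print(f"could not archive {url}: {error}")
            archived[url] = None
    return archived


# Example use (hypothetical file name):
#     with open("sites.txt") as handle:
#         sites = read_urls(handle.read())
#     crawl_locally(sites)      # local WARC copies, viewable in replayweb.page
#     save_to_wayback(sites)    # public copies on web.archive.org
```

Note that `save_to_wayback` submits only the exact URLs it is given, while grab-site crawls from each starting URL, so the two routes will not necessarily cover the same pages.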

