The Internet Archive is under a DDoS attack

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com.

  1. bookcorpus

    Crawl BookCorpus

  2. InfluxDB

    InfluxDB – Built for High-Performance Time Series Workloads. InfluxDB 3 OSS is now GA. Transform, enrich, and act on time series data directly in the database. Automate critical tasks and eliminate the need to move data externally. Download now.

  3. SingleFile

    Web Extension for saving a faithful copy of a complete web page in a single HTML file

    > But I don't think they're around anymore and have no idea how you could achieve similar functionality with dynamic pages anyway.

    Chromium's MHTML "Save as…" and the SingleFile WebExtension should both save copies of the rendered DOM.

    Apparently Safari has WebArchive and Mozilla had MAFF for similar use cases.

    I think WARC is supposed to save enough data about network streams for dynamic pages to work. At least on the Wayback Machine, infinite scrolling and "Load More" buttons do kinda work sometimes. You may have to load the archived pages in a browser and try to use each dynamic feature at least once, to trigger requests for needed resources.

    SingleFile: https://github.com/gildas-lormeau/SingleFile

    LWN on WARC, tools: https://anarc.at/blog/2018-10-04-archiving-web-sites/

    Self-hostable web archives: https://awesome-selfhosted.net/tags/archiving-and-digital-pr...

    Wayback Machine addons, bookmarklets: https://help.archive.org/help/save-pages-in-the-wayback-mach...

  4. ArchiveBox

    🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...
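
The comment under SingleFile above suggests that WARC keeps enough of the network traffic for dynamic pages to replay, whereas MHTML and SingleFile keep the rendered DOM. A rough sketch of that difference, assuming the WARC/1.0 record layout: each response record wraps the raw HTTP response (status line, headers, body) for one URL, so a replay tool can answer the same request later. The function names, the example URL, and the JSON payload here are hypothetical, and real tools also write request records, digests, and compression, which this sketch skips.

```python
import uuid
from datetime import datetime, timezone


def warc_response_record(target_uri: str, http_bytes: bytes) -> bytes:
    """Wrap one raw HTTP response in a minimal WARC/1.0 'response' record."""
    headers = [
        ("WARC-Type", "response"),
        ("WARC-Record-ID", f"<urn:uuid:{uuid.uuid4()}>"),
        ("WARC-Date", datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")),
        ("WARC-Target-URI", target_uri),
        ("Content-Type", "application/http; msgtype=response"),
        ("Content-Length", str(len(http_bytes))),
    ]
    head = "WARC/1.0\r\n" + "".join(f"{k}: {v}\r\n" for k, v in headers) + "\r\n"
    # A record is: header block, blank line, content block, then two CRLFs.
    return head.encode("utf-8") + http_bytes + b"\r\n\r\n"


def parse_warc_record(record: bytes):
    """Split a single uncompressed record back into version, fields, payload."""
    head, _, rest = record.partition(b"\r\n\r\n")
    lines = head.decode("utf-8").split("\r\n")
    fields = dict(line.split(": ", 1) for line in lines[1:])
    length = int(fields["Content-Length"])
    return lines[0], fields, rest[:length]


# Hypothetical response to a "Load More" request made by a dynamic page.
http_response = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: application/json\r\n"
    b"\r\n"
    b'{"items": [1, 2, 3]}'
)
uri = "https://example.org/api/items?page=2"
record = warc_response_record(uri, http_response)
version, fields, payload = parse_warc_record(record)
print(version, fields["WARC-Type"], fields["WARC-Target-URI"])
```

Because the archive is keyed by request URL rather than by the final DOM, a "Load More" request can only be replayed if it was actually made during capture, which fits the advice above to trigger each dynamic feature at least once.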


Related posts

  • Lost something? Search through 91.7 million files from the 80s, 90s, and 2000s

    3 projects | /r/DataHoarder | 19 Oct 2022
  • Is there a browser addon which locally archives every website I visit?

    4 projects | /r/DataHoarder | 23 Dec 2021
  • Any way to archive the wiki/megathread all at once?

    2 projects | /r/Piracy | 19 Dec 2021
  • Is there a way to make my bookmarks available offline to preserve them from future deletion?

    2 projects | /r/DataHoarder | 11 Dec 2021
  • Automatic Web Archiving?

    5 projects | /r/selfhosted | 26 Sep 2021
