Ask HN: What compression doesn't re-include the same file multiple times?

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • linux-timemachine

    Rsync-based OSX-like time machine for Linux, MacOS and BSD for atomic and resumable local and remote backups

  • > I am concerned about the longevity of my archives

    If you're concerned about archival longevity, and I for one certainly am, then maybe consider not compressing at all. Both compression and encryption make a backup more obscure and strip out redundancy. Using a widely understood file system and a very obvious arrangement of the data (for me that means: directories with dates, and below each one a tree of files that mimics their original locations) will be a huge plus should the data have to be recovered at some point in the future.

    Personally I am using a slightly adapted version of https://github.com/cytopia/linux-timemachine for this task. You do get de-duplication for the file transfer, but each file is written to the target unchanged. You get a timestamped directory for each backup run. Like macOS's Time Machine, the script uses hard links to de-duplicate identical files across the different timestamped directories, so the overall space requirement for the incremental backup you did an hour or a day later can be very small.

    I can attest that this setup, while it does not occupy the least conceivable amount of storage, is easy to search and trivial to use for recovery. It is much better in this regard than any kind of compressed archive format, which is always a pain in terms of searchability and so on.
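
The hard-link scheme described in the comment above can be illustrated with a minimal Python sketch. This is not how linux-timemachine is implemented (that script is rsync-based); the function name `snapshot`, the byte-for-byte comparison, and the assumption that snapshot directory names sort chronologically are all illustrative choices. Symlinks, permissions on directories, and deleted-file handling are left out for brevity.

```python
import filecmp
import os
import shutil
from pathlib import Path


def snapshot(source: Path, backup_root: Path, name: str) -> Path:
    """Create backup_root/name as a complete, browsable copy of source.

    Files identical to the same path in the most recent earlier snapshot
    are hard-linked instead of copied, so they take no extra space.
    Assumes snapshot names (e.g. timestamps) sort chronologically.
    """
    backup_root.mkdir(parents=True, exist_ok=True)
    earlier = sorted(p for p in backup_root.iterdir() if p.is_dir())
    previous = earlier[-1] if earlier else None
    target = backup_root / name

    for dirpath, _dirnames, filenames in os.walk(source):
        rel = Path(dirpath).relative_to(source)
        (target / rel).mkdir(parents=True, exist_ok=True)
        for filename in filenames:
            src = Path(dirpath) / filename
            dst = target / rel / filename
            old = previous / rel / filename if previous else None
            unchanged = (
                old is not None
                and old.is_file()
                and filecmp.cmp(src, old, shallow=False)  # compare contents
            )
            if unchanged:
                os.link(old, dst)        # same inode, no extra data blocks
            else:
                shutil.copy2(src, dst)   # new or modified file: real copy
    return target
```

Every snapshot directory then looks like a full copy of the source tree, so ordinary tools can search it or copy files back, while unchanged files share storage across snapshots.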



Related posts

  • What's the simplest way to take a snapshot of your server

    1 project | /r/HomeServer | 22 Dec 2022
  • Best practices for backups

    1 project | /r/linuxquestions | 12 Apr 2022
  • Tumbleweed without btrfs/snapper?

    1 project | /r/openSUSE | 23 Dec 2021
  • cytopia/linux-timemachine - Rsync-based OSX-like time machine for Linux, MacOS and BSD for atomic and resumable local and remote backups

    1 project | /r/bag_o_news | 31 Jan 2021
  • rsync based linux timemachine clone - now with full remote support

    1 project | /r/commandline | 30 Jan 2021