An almost perfect rsync over SSH backup script

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • restic

    Fast, secure, efficient backup program

  • Rsnapshot

    a tool for backing up your data using rsync (if you want to get help, use https://lists.sourceforge.net/lists/listinfo/rsnapshot-discuss)

  • rsnapshot[1] is what I used on FreeBSD with a snapshot. It is like a well-tested version of the author's rsync and ssh blog post. I have a blog post here[2] describing my setup. It saved my bacon multiple times.

    [1] https://github.com/rsnapshot/rsnapshot

  • synbak

    Synbak - Universal Backup System

  • My go-to backup solution, which can also manage rsync over SSH (among plenty of other things), is synbak[0]. With a short wrapper script that automatically mounts the backup medium, it is really simple to run a backup job (automatically). I use encrypted (LUKS) USB drives for that at home. Highly recommended.

    For encrypted backups to e.g. a NAS, Duplicity[1] is my go-to choice (a full backup every month or so, with an incremental backup every day in between).

    [0]: https://github.com/ugoviti/synbak

  • tummy-backup

    Disc-to-disc backup system using ZFS for deduplication and efficient storage.

  • I've been doing rsync-based backups of close to a thousand systems for ~20 years; most notably, I backed up the python.org infrastructure for a long time. I have quite a few thoughts on this, and a battle-tested rsync wrapper that I'll point to below.

    - Backups should be automatic and require attention only when something needs it. This script's philosophy seems to be "Just do your best, mail a log file, and rely on the user to figure out if something didn't work". Even for home backups, this is just wrong.

    - As an example of the above: This script notes that it fails if a backup takes more than 24 hours.

    - The "look for other rsyncs running" part of the code is an odd way of approaching locking, but for a single personal "push" backup I guess it is ok.

    I've got an rsync wrapper that has been battle tested over a couple decades and hundreds of servers here: https://github.com/linsomniac/tummy-backup/blob/master/sbin/...

    Features of it are:

    - As the filename implies, the goal is to rsync to a ZFS destination, and it will take a ZFS snapshot as part of this. It is easy to customize for another backup destination; I've had people report that they customized it for their own laptop backups, for example to an rsync.net destination.

    - It goes out of its way to detect when rsync has failed and log that.

    - It does "inplace" rsyncs, which dramatically saves space if you have large files that get appended to (logfiles, ZODB databases).

    - This is part of a larger system that manages the rsyncs of multiple systems, both local and remote. Things like alerting are done if a backup of a system has failed consistently for many days.

    - In the case that there are no failures, there is no e-mail sent, meaning the user only gets actionable e-mails.

    The hardlink trick only works for fairly small data sets. Issues include: managing hard links takes a lot of overhead, especially on spinning discs, and large files that are appended to use a ton of space (a 4GB file with 1K appended every day uses 128GB to store 14 dailies, 6 weeklies, and 12 monthlies). ZFS is a pretty good destination for rsync, as the equivalent snapshots will use about 4GB.
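
The 128GB figure in the comment above can be checked with a little arithmetic. This is a minimal sketch that assumes every retained snapshot holds its own full copy of the file, since hard links can only share files that have not changed between runs. The function names are illustrative and are not taken from any of the tools mentioned.

```shell
#!/bin/sh
# Worked arithmetic for the retention example in the comment above.
# Assumption: a file that changes every day cannot be hard-linked between
# snapshots, so each retained snapshot stores a full copy of it.

# snapshots_kept DAILIES WEEKLIES MONTHLIES: total number of retained snapshots
snapshots_kept() {
    echo $(( $1 + $2 + $3 ))
}

# hardlink_usage_gb FILE_SIZE_GB SNAPSHOTS: space used at one full copy per snapshot
hardlink_usage_gb() {
    echo $(( $1 * $2 ))
}

count=$(snapshots_kept 14 6 12)
echo "snapshots retained: $count"
echo "hard-link copies:   $(hardlink_usage_gb 4 "$count") GB"
```

With 14 dailies, 6 weeklies, and 12 monthlies there are 32 snapshots, and 32 copies of a 4GB file come to 128GB. A block-level snapshot only has to store the appended data on top of the original 4GB, which is where the comment's ZFS figure comes from.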

  • ohmycron

    Run cron jobs in a standardized environment with logs and locking

  • > # avoidng collisions with other rsync processes

    Use https://github.com/instacart/ohmycron

    > MONTHROTATE=monthrotate # use DD instead of YYMMDD

    Use https://rotate-backups.readthedocs.io/en/latest/readme.html

    > $RSYNC -avR "$SOURCE" "${RSYNCCONF[@]}"

    Just create a one-line script with the hardcoded rsync command you want to use, and pass the directory to sync as a command-line argument, e.g.

      #!/bin/sh
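
The commenter's example is cut off in this summary, so the following is not their script. It is a minimal sketch of the two ideas raised in the comment: avoiding collisions between runs, and wrapping one hardcoded rsync command that takes the directory as an argument. The locking here uses an atomic `mkdir`, which is a swapped-in technique and not necessarily how ohmycron does it. The function names and the lock path are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers; the names and the lock location are illustrative only.

# acquire_lock DIR: succeeds only if DIR did not already exist.
# mkdir is atomic, so two concurrent runs cannot both succeed.
acquire_lock() {
    mkdir "$1" 2>/dev/null
}

# release_lock DIR: removes the lock directory again.
release_lock() {
    rmdir "$1" 2>/dev/null
}

# backup_dir SOURCE DEST: runs one hardcoded rsync command under the lock.
# -avR are the flags quoted from the original script.
backup_dir() {
    lock="${TMPDIR:-/tmp}/backup-sketch.lock"
    if ! acquire_lock "$lock"; then
        echo "another backup is already running" >&2
        return 1
    fi
    rsync -avR "$1" "$2"
    status=$?
    release_lock "$lock"
    return "$status"
}
```

A call might look like `backup_dir /home/user user@backuphost:/backups/`, where both paths are placeholders. One limitation of this sketch is that a run that gets killed leaves the lock directory behind, so a real script would also want a `trap` to clean it up.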

  • kopia

    Cross-platform backup tool for Windows, macOS & Linux with fast, incremental backups, client-side end-to-end encryption, compression and data deduplication. CLI and GUI included.

  • Perhaps Kopia will work for your use case:

    https://kopia.io/

    https://kopia.io/docs/advanced/amazon-s3/

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.


Related posts

  • Kopia: Open-Source, Fast and Secure Open-Source Backup Software

    20 projects | news.ycombinator.com | 15 Sep 2023
  • Ask HN: How do you do backups for personal/home server?

    6 projects | news.ycombinator.com | 10 Jun 2023
  • I am looking for open-source backup application alternatives

    2 projects | /r/opensource | 17 Jan 2023
  • Which service to backup your important files ?

    3 projects | /r/selfhosted | 19 Nov 2022
  • Duplicati: Free backup software to store encrypted backups online

    13 projects | news.ycombinator.com | 3 Nov 2022