| | lrzip | Snebu |
|---|---|---|
| Mentions | 6 | 10 |
| Stars | 595 | 110 |
| Growth | - | - |
| Activity | 3.7 | 0.0 |
| Latest commit | 23 days ago | over 3 years ago |
| Language | C | C |
| License | GNU General Public License v3.0 only | GNU General Public License v3.0 only |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
lrzip
-
How to Get Your Backup to Half of Its Size – ZSTD Support in XtraBackup
lrzip
Long Range ZIP or LZMA RZIP
https://github.com/ckolivas/lrzip
"A compression utility that excels at compressing large files (usually > 10-50 MB). Larger files and/or more free RAM means that the utility will be able to more effectively compress your files (ie: faster / smaller size), especially if the filesize(s) exceed 100 MB. You can either choose to optimise for speed (fast compression / decompression) or size, but not both."
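The speed-versus-size trade-off the README describes maps onto lrzip's backend-selection flags; a quick sketch (flag names taken from lrzip's man page — verify against your installed version):

```shell
# Default backend is LZMA; -L sets effort (1 = fastest, 9 = smallest).
lrzip -L9 disk-image.img        # writes disk-image.img.lrz

# Optimise for speed instead: -l switches to the LZO backend.
lrzip -l disk-image.img

# Decompress; lrunzip is shorthand for lrzip -d.
lrunzip disk-image.img.lrz
```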
-
File compression
7zip and XZ are almost always the best in any comparison. (They use the same algorithm.) Occasionally something new comes along that may be better, but it fades away... Like lrzip. https://lkml.org/lkml/2011/6/4/23 https://github.com/ckolivas/lrzip
-
If we found a way to reverse a hashing function, would that make them ultra-compression algorithms?
For example, lrzip has an intense "dupe hunting" mode that takes days on large content, but it compresses very well once done (and expansion is fast). I use it on long-term storage backups, disk images, and the like. It's completely incompatible with streaming on the input side, unlike chunk-based formats such as gzip or deflate, although unpacking can stream, e.g. for searching or verifying a tar archive. The original source has to be file-based so that seeking for the dupe hunting can work across the entire file as a block.
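Because the long-range matching needs a single seekable input, the usual pattern is tar first, then lrzip; a sketch (the `-U` unlimited-window flag and the `lrzcat` helper are taken from lrzip's documentation — check your version):

```shell
# Pack the tree into one seekable file first.
tar -cf backup.tar /srv/archive

# -U lifts the compression-window limit so rzip's match-finding
# can span the whole file (slow, but finds distant duplicates).
lrzip -U backup.tar

# Unpacking can stream: list the archive without extracting it.
lrzcat backup.tar.lrz | tar -tf -
```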
- Lrzip – Long Range Zip or LZMA RZIP
-
Ask HN: How would you store 10PB of data for your startup today?
Best I know of for that is something like lrzip still, but even then it's probably not state of the art. https://github.com/ckolivas/lrzip
It'll also take a hell of a long time to do the compression and decompression. It'd probably be better to do some kind of chunking and deduplication instead of compression itself, simply because I don't think you're ever going to have enough RAM to store any kind of dictionary that would effectively handle so much data. You'd also not want to have to re-read and reconstruct that dictionary just to get at some random image.
-
Encrypted Backup Shootout
There's also lrzip for large files: https://github.com/ckolivas/lrzip
Snebu
-
I'm working on a tar implementation with public key encryption extensions.
As such, I use tar for the serialization of backup data for Snebu (https://www.snebu.com), which has a plugin (tarcrypt) that operates on the data streams. Snebu ingests tar format and emits tar format, so all you need to back up/restore a host is ssh access (the server can pull backups, or the client can push backups). So tarcrypt was added as a way to do client-side encryption while still submitting recognizable tar files to Snebu's backend (which indexes, de-duplicates, and snapshots backups).
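Roughly, that gives a push pipeline like the following sketch; the tarcrypt flags and the snebu subcommand shown here are assumptions based on the description above, not verified syntax:

```shell
# Client-side push: tar serializes, tarcrypt encrypts the stream
# against a public key, and ssh carries it to the Snebu server.
# (Flag and subcommand names are illustrative -- consult the docs.)
tar -cf - /home /etc \
  | tarcrypt -e /etc/backup-pubkey.pem \
  | ssh backupserver snebu submitfiles --name myhost
```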
-
I'm giving out microgrants to open source projects for the third year in a row! Brag about your projects here so I can see them, big or small!
Snebu, on github. Simple Network Encrypting Backup Utility.
- Using Git For Backups
- Restic: Backups Done Right
- Deduplicating Archiver with Compression and Encryption
-
Backup encryption using SSH keys with age anno 2021
Details are at https://www.snebu.com/tarcrypt.html if you want to look it over (and tarcrypt is part of the Snebu project https://github.com/derekp7/snebu). I'd love to get another pair of eyes on this to point out any non-obvious security limitations.
-
Interview with CEO of rsync.net: “no firewalls and no routers”
Since I've had a handful of users ask about cloud storage for Snebu, would you be interested in adding Snebu as a supported protocol? It should be similar to how you currently support Borg. For Snebu, the client runs find and tar, sending results via ssh to the snebu binary on the remote host. More recently, client-side public key encryption support has been added via a client-side filter called "tarcrypt". Ideally, a customer would use Snebu to back up to a local device on their network (for example, a Raspberry Pi with a large USB drive attached), and then use Snebu's efficient replication to send deltas to the cloud-hosted server. Client files are stored individually (deduplicated) on the Snebu server, and metadata is in an SQLite DB (the advantages over Borg are more open standards for the data storage and public-key encryption; the disadvantages are file-level instead of block-level deduplication and a project that isn't as widely used).
If you are interested, I would be more than happy to have an extended discussion with you going over implementation options, and updating the client-side script to make it work better with your service. (https://www.snebu.com, https://github.com/derekp7/snebu, and the tarcrypt extensions to tar are described at https://www.snebu.com/tarcrypt.html).
-
Pet Project Thread February 26 2021
Would a mention of my open source backup system, Snebu (https://github.com/derekp7/snebu) fit in this thread? Elevator pitch -- GPLv3 C code, snapshot-based, compresses, encrypts, deduplicates, can back up clients without installing an agent (you just need the ssh, bash, tar, and find commands on the client for "pull" backups), and push backups can have restricted permissions (i.e., give a client permission to push backups only, but not delete backups, or give a user restore-only permissions). It uses tar to collect the data, stores metadata in an SQLite DB on the server, and stores files in LZO format so they can be read directly with lzop (unless client-side encryption is used, but the data can still be decrypted with openssl and then decompressed with lzop). Encryption is public-key based instead of needing to keep a shared symmetric key or passphrase lying around on your backup server.
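The "can be read directly with lzop" point means a stored object is recoverable with stock tools; a minimal sketch with a hypothetical vault path (Snebu's actual on-disk layout may differ):

```shell
# Hypothetical object path -- Snebu stores deduplicated file
# contents named by hash under its storage directory.
OBJ=/var/lib/snebu/vault/3a/3ab1c2d9f00d.lzo

# -d decompresses, -c writes to stdout; no Snebu tooling needed.
lzop -dc "$OBJ" > recovered-file
```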
-
What backup method do you use?
I created and use Snebu -- I'm working on getting it submitted to Fedora (waiting on package review now), doing daily snapshots of my fleet to a Raspberry Pi with an external 12 TB WD Easystore drive. It provides push- or pull-based backups, granular access permissions, client-side public key encryption (RSA + AES-256) with HMAC validation, a server-based data catalog housed in SQLite, multiple-client support, and global (cross-client) file-level deduplication and compression. It works great for backing up a large range of OS versions since the client side doesn't need an agent -- just bash, tar, find, and ssh.
-
Encrypted Backup Shootout
snebu (c) - https://github.com/derekp7/snebu
What are some alternatives?
bupstash - Easy and efficient encrypted backups.
UrBackup - UrBackup - Client/Server Open Source Network Backup for Windows, MacOS and Linux
rdedup - Data deduplication engine, supporting optional compression and public key encryption.
Elkarbackup - Open source backup solution for your network
duplicity - mirror of duplicity: https://code.launchpad.net/duplicity
BorgBackup - Deduplicating archiver with compression and authenticated encryption.
LeoFS - The LeoFS Storage System
restic - Fast, secure, efficient backup program
Rsnapshot - a tool for backing up your data using rsync (if you want to get help, use https://lists.sourceforge.net/lists/listinfo/rsnapshot-discuss)
ParlAI - A framework for training and evaluating AI models on a variety of openly available dialogue datasets.
Duplicati - Store securely encrypted backups in the cloud!