casync
duplicity
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
casync
-
We reduced conda’s index fetch bandwidth by 99%
For arbitrary state changes, however, it's better to use something like casync. Note that there are a lot of tunables, implicit and explicit; for package indexing I would particularly think about "how is the index sorted" and "what is the desired chunk size".
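To make the "desired chunk size" tunable concrete, here is a minimal sketch of content-defined chunking in Python. It is not casync's actual algorithm or its defaults: the gear-style rolling hash, the `chunks` and `shared_fraction` helpers, and the `avg_bits`, `min_size` and `max_size` parameters are all assumptions made for illustration.

```python
import hashlib
import random

# Fixed pseudo-random table mapping each byte value to a 32-bit number.
# (Illustrative only; real tools ship their own tables and hash functions.)
_rng = random.Random(0)
GEAR = [_rng.getrandbits(32) for _ in range(256)]


def chunks(data, avg_bits=8, min_size=64, max_size=4096):
    """Split data at content-defined boundaries.

    A cut is made where the top `avg_bits` bits of a rolling hash are zero,
    so cut points follow the content rather than fixed offsets. For
    random-looking data the mean chunk is roughly min_size + 2**avg_bits bytes.
    """
    shift = 32 - avg_bits
    out, start, h = [], 0, 0
    for i, byte in enumerate(data):
        # Gear-style update: older bytes are shifted out of the 32-bit state.
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        length = i - start + 1
        if (length >= min_size and (h >> shift) == 0) or length >= max_size:
            out.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        out.append(data[start:])  # trailing partial chunk
    return out


def shared_fraction(old, new, **kw):
    """Fraction of the chunks of `new` whose digest already exists for `old`."""
    known = {hashlib.sha256(c).digest() for c in chunks(old, **kw)}
    new_chunks = chunks(new, **kw)
    hits = sum(hashlib.sha256(c).digest() in known for c in new_chunks)
    return hits / len(new_chunks) if new_chunks else 1.0


# Small demo: insert a few bytes near the start of a 64 KiB buffer.
rng = random.Random(1)
old = bytes(rng.getrandbits(8) for _ in range(1 << 16))
new = old[:100] + b"hello" + old[100:]
print(f"chunks reusable after a small insert: {shared_fraction(old, new):.0%}")
```

Smaller chunks tend to localize a change better but mean more chunks to index and look up; larger chunks do the opposite. The sort order of an index matters for a similar reason: if unchanged entries stay next to each other between versions, the chunks covering them are more likely to come out byte-identical and be reused.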
-
Intro to Content Defined Chunking
If you just want something practical to play with, see casync. Even if it doesn't fit your workflow, or if you think you can do better, chances are you're best off building on top of it or adding patches to it, not starting from scratch.
-
Tool to clone file structure without the large files themselves?
You probably want casync.
-
A Nibble of Content-Defined Chunking - How de-duplicated, incremental file transfer works
Obligatory link to casync, which implements this better than most alternatives.
-
LibSQL – a fork of SQLite that is both Open Source, and Open Contributions
(Personally, I think more people need to be aware of casync for the update storage/distribution problem. It isn't perfect for every use case, but it's good enough that you're probably better off wrapping or forking it rather than reimplementing it badly from scratch.)
-
improving download infra
Does something like casync (https://github.com/systemd/casync or https://github.com/folbricht/desync) serve any purpose or provide any advantage to propagating rpm changes over rsync?
-
Are there any true alternatives to Seafile? (Nextcloud is not an alternative in this context)
Software that comes to mind for syncing lots of small files: git (and other source versioning tools), casync (https://github.com/systemd/casync) and a Go implementation (https://github.com/folbricht/desync). Not really an answer, and I can't think of a way to shoehorn that into your workflow, but maybe it leads you down a useful road.
- Casync – A Content-Addressable Data Synchronization Tool
-
Hacker News top posts: Apr 23, 2022
Casync – A Content-Addressable Data Synchronization Tool (15 comments)
duplicity
- Restic: Backups Done Right
- Deduplicating Archiver with Compression and Encryption
-
I recently learned about CHANGELOG and have a few questions about them
Well, that depends on you. There are projects that put more or less every commit, or every PR, into the CHANGELOG (there are even tools for this). (Example)
-
Encrypted Backup Shootout
duplicity (Python) - https://github.com/henrysher/duplicity
What are some alternatives?
kopia - Cross-platform backup tool for Windows, macOS & Linux with fast, incremental backups, client-side end-to-end encryption, compression and data deduplication. CLI and GUI included.
BorgBackup - Deduplicating archiver with compression and authenticated encryption.
tarsnap - Command-line client code for Tarsnap.
rclone - "rsync for cloud storage" - Google Drive, S3, Dropbox, Backblaze B2, One Drive, Swift, Hubic, Wasabi, Google Cloud Storage, Azure Blob, Azure Files, Yandex Files
desync - Alternative casync implementation
restic - Fast, secure, efficient backup program
zstd - Zstandard - Fast real-time compression algorithm
magic-trace - magic-trace collects and displays high-resolution traces of what a process is doing
Duplicati - Store securely encrypted backups in the cloud!
Bup - Very efficient backup system based on the git packfile format, providing fast incremental saves and global deduplication (among and within files, including virtual machine images).