snapraid
hashdeep
Metric | snapraid | hashdeep
---|---|---
Mentions | 86 | 10
Stars | 1,841 | 680
Growth | - | -
Activity | 6.7 | 0.0
Last commit | 3 months ago | almost 2 years ago
Language | C | C++
License | GNU General Public License v3.0 only | GNU General Public License v3.0 or later
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
snapraid
-
Storage software with the features of Unraid but runs on Debian with cli interface?
Would mergerfs and snapraid work for you? You'd sacrifice a disk to parity and run the parity calc manually, but you could set up a cron job for that.
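The scheduled parity run mentioned above could be wrapped in a small script that cron calls. This is only a sketch: it assumes the `snapraid` binary is on PATH and already configured, and the script path and schedule in the comment are hypothetical.

```python
import subprocess
from typing import Callable, Sequence

# Assumption: `snapraid` is installed, on PATH, and its config file is set up.
SYNC_COMMAND: Sequence[str] = ("snapraid", "sync")

# Hypothetical crontab entry that would run this script nightly at 03:00:
#   0 3 * * * /usr/bin/python3 /usr/local/bin/snapraid_sync.py


def run_sync(runner: Callable[..., subprocess.CompletedProcess] = subprocess.run) -> int:
    """Run `snapraid sync` once and return its exit code.

    `runner` is injectable so the wrapper can be exercised without
    touching real disks.
    """
    result = runner(list(SYNC_COMMAND), check=False)
    return result.returncode
```

Calling `run_sync()` from the script's entry point and passing the return value to `sys.exit` lets cron report a failed sync through the exit status.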
- Data storage solution for "archival" purposes.
-
The Next Gen Database Servers Powering Let's Encrypt (2021)
Like most people on r/homelab, it started out with Plex. Rough timeline/services below:
0. Got a Synology DS413 with 4x WD Red 3TB drives. Use Playstation Media Server to stream videos from it. Eventually find some Busybox stuff to add various functionality to the NAS, but it had a habit of undoing them periodically, which was frustrating. I also experienced my first and (knock on wood) only drive failure during this time, which concluded without fanfare once the faulty drive was replaced, and the array repaired itself.
1. While teaching myself Python as an Electrical Distribution Engineer at a utility, I befriended the IT head, who gave me an ancient (I think Nehalem? Quad-core Xeon) Dell T310. Promptly got more drives, totaling 7, and tried various OS / NAS platforms. I had OpenMediaVault for a while, but got tired of the UI fighting me when I knew how to do things in shell, so I switched to Debian (which it's based on anyway). Moved to MergerFS [0] + SnapRAID [1] for storage management, and Plex for media. I was also tinkering with various Linux stuff on it constantly.
1.1 Got tired of my tinkering breaking things and requiring troubleshooting/fixing (in retrospect, this provided excellent learning), so I installed Proxmox, reinstalled Debian, and made a golden image with everything set up as desired so I could easily revert.
1.2 A friend told me about Docker. I promptly moved Plex over to it, and probably around this time also got the *Arr Stack [2] going.
2. Got a Supermicro X9DRi-LN4F+ in a 2U chassis w/ 12x 3.5" bays. Got faster/bigger CPUs (E5-2680v2), more RAM, more drives, etc. Shifted container management to Docker Compose. Modded the BIOS to allow it to boot from a NVMe drive on a PCIe adapter.
2.1 Shifted to ZFS on Debian. Other than DKMS occasionally losing its mind during kernel upgrades, this worked well.
2.2 Forked [3] some [4] Packer/Ansible projects to suit my needs, made a VM for everything. NAS, Dev, Webserver, Docker host, etc. Other than outgrowing (IMO) MergerFS/SnapRAID, honestly at this point I could have easily stopped, and could to this day revert back to this setup. It was dead reliable and worked extremely well. IIRC I was also playing with Terraform at this time.
2.3 Successfully broke into tech (Associate SRE) as a mid-career shift, due largely (according to the hiring manager) to what I had done with my homelab. Hooray for hobbies paying off.
3. Got a single Dell R620. I think the idea was to install either pfSense or VyOS on it, but that never came to fruition. Networking was from a Unifi USG (their tiny router + firewall + switch) and 8-port switch, with some AC Pro APs.
4. Got two more R620s. Kubernetes all the things. Each one runs Proxmox in a 3-node cluster with two VMs - a control plane, and worker.
4.0.1 Perhaps worth noting here that I thoroughly tested my migration plan via spinning up some VMs in, IIRC, Digital Ocean that mimicked my home setup. I successfully ran it twice, which was good enough for me.
4.1 Played with Ceph via Rook, but a. disliked (and still do to this day) running storage for everything out of K8s b. kept getting clock skew between nodes. Someone on Reddit mentioned it was my low-power C-state settings, but since that was saving me something like ~50 watts/node, I didn't want to deal with the higher power/heat. I landed on Longhorn [5] for cluster storage (i.e. anything that wasn't being handled by the ZFS pool), which was fine, but slow. SATA SSDs (used Intel enterprise drives with PLP, if you're wondering) over GbE aren't super fast, but they should be able to exceed 30 MBps.
4.1.1 Again, worth noting that I spent literally a week poring over every bit of Ceph documentation I could find, from the Red Hat stuff to random Wikis and blog posts. It's not something you just jump into, IMO, and most of the horror stories I read boiled down to "you didn't follow the recommended practices."
5. Got a newer Supermicro, an X11SSH-F, thinking that it would save power consumption over the older dual-socket I had for the NAS. It turned out to not make a big difference. For some reason I don't recall, I had a second X9DRi-LN4F+ mobo, so I sold the other one with the faster CPUs, bought some cheaper CPUs for the other one, and bought more drives for it. It's now a backup target that boots up daily to ingest ZFS snapshots. I have 100% on-site backups for everything. Important things (i.e. anything that I can't get from a torrent) are also off-site.
6. Got some Samsung PM863 NVMe SSDs mounted on PCIe adapters for the Dells, and set up Ceph, but this time handled by Proxmox. It's dead easy, and for whatever reason isn't troubled by the same clock skew issues as I had previously. Still in the process of shifting cluster storage from Longhorn, but I have been successfully using Ceph block storage as fast (1 GbE, anyway - a 10G switch is on the horizon) storage for databases.
So specifically, you asked what I do with the hardware. What I do, as far as my family is concerned, is block ads and serve media. On a more useful level, I try things out related to my job, most recently database-related (I moved from SRE to DBRE a year ago). I have MySQL and Postgres running, and am constantly playing with them. Can you actually do a live buffer pool resize in MySQL? (yes) Is XFS actually faster than ext4 for large DROP TABLE operations? (yes, but not by much) Is it faster to shut down a MySQL server and roll back to a previous ZFS snapshot than to roll back a big transaction? (often yes, although obviously a full shutdown has its own problems) Does Postgres suffer from the same write performance issue as MySQL with random PKs like UUIDv4, despite not clustering by default? (yes, but not to the same extent - still enough to matter, and you should use UUIDv7 if you absolutely need them)
I legitimately love this stuff. I could quite easily make do without a fancy enclosed rack and multiple servers, but I like them, so I have them. The fact that it tends to help my professional growth out at the same time is a bonus.
[0]: https://github.com/trapexit/mergerfs
[1]: https://www.snapraid.it
[2]: https://wiki.servarr.com
[3]: https://github.com/stephanGarland/packer-proxmox-templates
[4]: https://github.com/stephanGarland/ansible-initial-server
[5]: https://longhorn.io
-
Bitrot protection with BTRFS and Rsync
If you are using OpenMediaVault, check out the SnapRAID plugin.
-
Does this count?
I used drivepool for years with snapraid before I switched over to unraid. Nothing but good things to say about either program. Highly recommend both.
-
Merge/Raid HDD documentation
You can always use SnapRAID. There is no user interface; it is CLI-only. You also have to sync it manually or set up a cron job. You lose an HDD's worth of capacity, like with unRaid or RAID5, but it gives you parity. Then you could always use Duplicati and Backblaze business to make backups. It isn't as expensive as you would think for a homelab. The first backup might be a little much, but then it's pennies after that.
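To illustrate why giving up one disk buys you parity, here is a minimal sketch of the single-parity idea using XOR, the same principle RAID5 relies on. It is not SnapRAID's actual implementation or on-disk format; the block contents and function name are made up for the example.

```python
from functools import reduce
from typing import List


def xor_blocks(blocks: List[bytes]) -> bytes:
    """XOR equal-length byte blocks together, column by column."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Three hypothetical data "disks", each holding one 8-byte block.
data_disks = [b"movie-01", b"photo-02", b"music-03"]

# The parity disk stores the XOR of all data blocks.
parity = xor_blocks(data_disks)

# Simulate losing the second disk: XOR the survivors with the parity block.
survivors = [data_disks[0], data_disks[2], parity]
recovered = xor_blocks(survivors)

assert recovered == data_disks[1]
```

Because XOR is its own inverse, any single missing block can be rebuilt from the remaining blocks plus parity, which is why one parity disk covers exactly one failure.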
-
Converting my old pc to a backup solution
As for the drives I'm thinking of grabbing a few from ServerPartDeals and upgrading my setup that uses DrivePool and snapRAID, but in Linux you would use mergerfs instead of DrivePool.
-
Thinking of switching from a 4 bay hardware RAID 5 to an 8 bay JBOD. Looking for opinions.
I myself subscribe to the teachings of IronicBadger (Alex Kretzschmar) from the Self-Hosted podcast and (when I get one set up) intend to follow the guides on his site https://perfectmediaserver.com, using mergerfs to turn a JBOD into a single filesystem and SnapRAID for redundancy.
-
WWYD? Help choosing b/w NAS, DAS or micro PC case?
I have that exact Sabrent 5 bay enclosure and I replaced it with this Orico 5 bay enclosure because the Sabrent's connection would fail while I ran snapraid sync, but the Orico has never failed me.
-
Just ordered 2 20TB CMR HDD’s and I’m extremely excited. What RAID method should I use?
With two drives? Just run them as-is. Once you get a third disk, set up https://www.snapraid.it. Then as you add more, just follow https://www.snapraid.it/faq#howmanypar to have the correct number of parity disks. Or don't: Plex content is probably easy to re-acquire if you need to, so having redundancy or a backup isn't all that important.
hashdeep
-
I have 2 copies of the same data in separate HDDs. Which copy should I use to create the 3rd one?
Otherwise check out Hashdeep: https://github.com/jessek/hashdeep/
- Forever version history has potential, this is an opportunity for BB
-
DS 415 play in 2022
I use hashdeep. You have to install ipkgui on Synology and then md5deep through that which includes hashdeep (or just use md5deep). You can read up on hashdeep/md5deep here in the docs folder: https://github.com/jessek/hashdeep/
-
How do I manage years of data?
I started to be able to bring some order after I discovered hashdeep. Basically, I started from a reasonably clean disk with folders to sort files, created lists of hashes using hashdeep, then used it to scan all my existing disks for unknown files. With the correct flags, hashdeep can list all files it finds on a disk that are not already in its lists. That helps a lot to figure out what is worth wasting time on. It is also useful because every now and then it makes me realize that the copy of some old file I have is broken (usually because it was stored on some CD-ROM that was no longer good).
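The workflow described above (hash a clean disk, then flag files elsewhere whose hashes are not in that list) can be sketched with Python's standard library. This is not hashdeep itself and does not use its file format or flags; the function names and the choice of SHA-256 are assumptions for illustration.

```python
import hashlib
from pathlib import Path
from typing import Iterator, List, Set


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def iter_files(root: Path) -> Iterator[Path]:
    """Yield every regular file under root in sorted order."""
    return (p for p in sorted(root.rglob("*")) if p.is_file())


def build_known_hashes(clean_root: Path) -> Set[str]:
    """Hash every file on the already-sorted 'clean' disk."""
    return {file_digest(p) for p in iter_files(clean_root)}


def find_unknown_files(scan_root: Path, known: Set[str]) -> List[Path]:
    """List files under scan_root whose content is not in the known set."""
    return [p for p in iter_files(scan_root) if file_digest(p) not in known]
```

Matching is by content only, so a renamed copy of a known file is treated as known, while a corrupted copy shows up as unknown.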
-
What is the best way to cold store valuable files for decades?
MD5Deep / HashDeep - Windows and Linux options. For Windows, see v4.4 under "Releases" on the right: https://github.com/jessek/hashdeep/
-
I need to switch away from Storage Spaces; I need help deciding what to go with.
MD5Deep/HashDeep: https://github.com/jessek/hashdeep/ (download file under "Releases" on right hand side)
- Possible bitrot or similar in a folder of photos, looking for advice
-
Need Advice for Long-Term Storage
md5deep/hashdeep (https://github.com/jessek/hashdeep - see package download on side under "releases") - another command line tool, although a bit more complex but here's one way to do it:
-
Wrote This Windows Batch Script for Easy Use of HASHDEEP for MD5 Checksums
You can download hashdeep from here: https://github.com/jessek/hashdeep/releases/tag/v4.4
- Maintenance for a Noob Data Hoarder Setup?
What are some alternatives?
mergerfs-tools - Optional tools to help manage data in a mergerfs pool
AntiDupl - A program to search similar and defect pictures on the disk
Elucidate - Elucidate: A GUI to drive the SnapRAID command line (via .Net)
cshatag - Detect silent data corruption under Linux using sha256 stored in extended attributes
dupeguru - Find duplicate files
RHash - Great utility for computing hash sums
MultiPar - Parchive tool
k4dirstat - K4DirStat (KDE Directory Statistics) is a small utility program that sums up disk usage for directory trees, very much like the Unix 'du' command. It displays the disk space used up by a directory tree, both numerically and graphically (copied from the Debian package description).
mergerfs - a featureful union filesystem
Rsnapshot - a tool for backing up your data using rsync (if you want to get help, use https://lists.sourceforge.net/lists/listinfo/rsnapshot-discuss)
snapraid-btrfs - Script for using snapraid with btrfs snapshots
restic - Fast, secure, efficient backup program