Metric | hashdeep | czkawka |
---|---|---|
Mentions | 10 | 364 |
Stars | 719 | 21,561 |
Growth | 0.0% | 3.0% |
Activity | 0.0 | 6.9 |
Last commit | 2 months ago | 17 days ago |
Language | C++ | Rust |
License | GNU General Public License v3.0 or later | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
hashdeep
- I have 2 copies of the same data in separate HDDs. Which copy should I use to create the 3rd one?
Otherwise check out Hashdeep: https://github.com/jessek/hashdeep/
- Forever version history has potential, this is an opportunity for BB
- DS 415 play in 2022
I use hashdeep. You have to install ipkgui on Synology and then install md5deep through that, which includes hashdeep (or just use md5deep). You can read up on hashdeep/md5deep in the docs folder here: https://github.com/jessek/hashdeep/
- How do I manage years of data?
I started to be able to bring some order after I discovered hashdeep. Basically, I started from a reasonably clean disk with folders to sort files, created lists of hashes using hashdeep, then used it to scan all my existing disks for unknown files. With the correct flags, hashdeep can list all files it finds on a disk that are not already in its lists. That helps a lot to figure out what is worth spending time on. It is also useful because every now and then it makes me realize that the copy I have of some old file is broken (probably because it was stored on a CD-ROM that was no longer good).
- What is the best way to cold store valuable files for decades?
MD5Deep / HashDeep - Windows and Linux options. For Windows, get v4.4 under "Releases" on the right: https://github.com/jessek/hashdeep/
- I need to switch away from Storage Spaces; I need help deciding what to go with.
MD5Deep/HashDeep: https://github.com/jessek/hashdeep/ (download the file under "Releases" on the right-hand side)
- Possible bitrot or similar in a folder of photos, looking for advice
- Need Advice for Long-Term Storage
md5deep/hashdeep (https://github.com/jessek/hashdeep - see the package download on the side under "Releases") - another command-line tool, a bit more complex, but here's one way to do it:
- Wrote This Windows Batch Script for Easy Use of HASHDEEP for MD5 Checksums
You can download hashdeep from here: https://github.com/jessek/hashdeep/releases/tag/v4.4
- Maintenance for a Noob Data Hoarder Setup?
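The workflow described in the "How do I manage years of data?" excerpt above (build a list of hashes from a clean disk, then scan other disks for files not in that list) can be sketched in a few lines. This is a minimal illustration in Python using only the standard library, not hashdeep itself; the function names `file_digest`, `build_manifest`, and `unknown_files` are hypothetical, and SHA-256 is an assumed choice of hash.

```python
import hashlib
from pathlib import Path


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in chunks so large files do not have to fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Map digest -> relative paths for every regular file under root."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = str(path.relative_to(root))
            manifest.setdefault(file_digest(path), []).append(rel)
    return manifest


def unknown_files(known_manifest, other_root):
    """List files under other_root whose content is not in known_manifest."""
    other_root = Path(other_root)
    return [
        str(path.relative_to(other_root))
        for path in sorted(other_root.rglob("*"))
        if path.is_file() and file_digest(path) not in known_manifest
    ]
```

Under these assumptions, `unknown_files(build_manifest("/clean/disk"), "/old/disk")` would list only the files on the old disk whose content has not been seen before; the two paths are placeholders. A file that has the same name as a known file but shows up as unknown is a candidate for the kind of broken copy the excerpt mentions.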
czkawka
- Ask HN: How do you deduplicate files?
You want content-addressed storage; this works with rolling content hashes that identify common blocks of memory. `rsync` uses that technique to minimize bytes to be transferred. https://github.com/qarmin/czkawka is a GUI app and CLI tool to find identical files in general and similar images in particular.
The task is much simpler if you only want to find bit-identical entire files, not part of files; in that case, you can just run a tool like `sha1sum` over each file and record the hash digest in a database; identical files—and only identical ones, with high probability—will have the same hash, non-identical ones will have different hashes.
- Czkawka: Multi functional app to find duplicates, empty folders, similar images
- Duperemove – Tools for deduping file systems
You might be interested in this app: https://github.com/qarmin/czkawka
- Is there software to compress large but similar files?
- Merge three separate partial libraries from external USB drives
- Tools to deduplicate files
https://github.com/qarmin/czkawka is by far the best of anything I've tried
- fdupes: Identify or Delete Duplicate Files
I've used Czkawka (https://github.com/qarmin/czkawka) because it does Lanczos-based image duplicate detection, which makes it more practical for me.
- AllDup suddenly taking forever to process/delete selections
Maybe it's a setting you made or the files, not sure. You can try another program, czkawka, to see if you get better results with it.
- Is there a file duplicate finder that works with animated jpegxl-gif?
For static images I used https://github.com/qarmin/czkawka and it works well enough, I think. But when I used it on a folder with GIFs and their JXL conversions, it showed nothing. SURELY this could not be user error, rrrright?
- PhotoPrism: Browse Your Life in Pictures
I used to use DupeGuru which has some photo-specific dupe detection where you can fuzzy match image dupes based on content: https://dupeguru.voltaicideas.net/
But I switched over to czkawka, which has a better interface for comparing files, and seems to be a bit faster: https://github.com/qarmin/czkawka
Unfortunately, neither of these is integrated into Photoprism, so you still have to do some file management outside the database before importing.
I also haven't used Photoprism extensively yet (I think it's running on one of my boxes, but I haven't gotten around to setting it up), but I did find that it wasn't really built for file-based libraries. It's a little more heavyweight, but my research shows that Nextcloud Memories might be a better choice for me (it's not the first-party Nextcloud photos app, but another one put together by the community): https://apps.nextcloud.com/apps/memories
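The whole-file approach from the Ask HN excerpt above (hash every file, record the digest, and treat files with equal digests as identical) can be sketched as follows. This is a minimal Python illustration, not how czkawka or any other listed tool works internally; `sha1_of` and `find_duplicates` are hypothetical names, and the size pre-filter is an added optimization that the excerpt does not mention.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha1_of(path, chunk_size=1 << 20):
    """Return the SHA-1 hex digest of a file, read in chunks."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return groups of paths under root that have identical content."""
    # Files of different sizes cannot be identical, so only hash
    # files that share their size with at least one other file.
    by_size = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            by_size[path.stat().st_size].append(path)

    by_digest = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        for path in paths:
            by_digest[sha1_of(path)].append(path)

    # Keep only digests shared by two or more files.
    return [sorted(group) for group in by_digest.values() if len(group) > 1]
```

Each returned group is a list of `Path` objects with the same digest. As the excerpt notes, equal digests indicate identical content only with high probability, so a byte-for-byte comparison before deleting anything would be a reasonable extra safeguard.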
What are some alternatives?
RHash - Great utility for computing hash sums
dupeguru - Find duplicate files
k4dirstat - K4DirStat (KDE Directory Statistics) is a small utility program that sums up disk usage for directory trees, very much like the Unix 'du' command. It displays the disk space used up by a directory tree, both numerically and graphically (copied from the Debian package description).
jdupes - A powerful duplicate file finder and an enhanced fork of 'fdupes'.
cshatag - Detect silent data corruption under Linux using sha256 stored in extended attributes
fdupes - FDUPES is a program for identifying or deleting duplicate files residing within specified directories.
AntiDupl - A program to search similar and defect pictures on the disk
snapraid - A backup program for disk arrays. It stores parity information of your data and it recovers from up to six disk failures
PhotoPrism - AI-Powered Photos App for the Decentralized Web 🌈💎✨
bleachbit - BleachBit system cleaner for Windows and Linux