hashdeep vs czkawka

| | hashdeep | czkawka |
|---|---|---|
| Mentions | 10 | 361 |
| Stars | 680 | 17,680 |
| Growth | - | - |
| Activity | 0.0 | 7.7 |
| Last Commit | almost 2 years ago | 11 days ago |
| Language | C++ | Rust |
| License | GNU General Public License v3.0 or later | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
hashdeep
- I have 2 copies of the same data in separate HDDs. Which copy should I use to create the 3rd one?
Otherwise check out Hashdeep: https://github.com/jessek/hashdeep/
- Forever version history has potential, this is an opportunity for BB
- DS 415 play in 2022
I use hashdeep. You have to install ipkgui on Synology and then md5deep through that, which includes hashdeep (or just use md5deep). You can read up on hashdeep/md5deep in the docs folder here: https://github.com/jessek/hashdeep/
- How do I manage years of data?
I started to be able to bring some order after I discovered hashdeep. Basically I started from a reasonably clean disk with folders to sort files, created lists of hashes using hashdeep, then used it to scan all my existing disks for unknown files. With the correct flags, hashdeep can list every file it finds on a disk that is not already in its lists. That helps a lot in figuring out what is worth spending time on. It is also useful because every now and then it makes me realize that my copy of some old file is broken (usually because it was stored on a CD-ROM that had gone bad).
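The workflow described above — build a manifest of known hashes, then flag files whose hashes aren't already in it — is what hashdeep's `-k` (known hashes) and `-x` (negative matching) flags do. As a rough illustration of the same idea, here is a minimal Python sketch using `hashlib`; the function names are mine, not hashdeep's, and hashdeep's own CSV manifest format is not reproduced:

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Stream a file through SHA-256 in chunks (constant memory)."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root: Path) -> set[str]:
    """Hash every file under root, roughly like `hashdeep -r root`."""
    return {sha256_of(p) for p in root.rglob("*") if p.is_file()}


def unknown_files(root: Path, manifest: set[str]) -> list[Path]:
    """Files under root whose hash is not in the manifest,
    roughly like hashdeep's negative matching (`-r -x -k hashes.txt`)."""
    return [p for p in root.rglob("*")
            if p.is_file() and sha256_of(p) not in manifest]
```

Because matching is by content hash, a renamed or moved copy of a known file is still recognized as known — only genuinely new (or corrupted) content shows up.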
- What is the best way to cold store valuable files for decades?
MD5Deep / HashDeep - Windows and Linux options. For Windows, the download is on the right under "Releases" (v4.4): https://github.com/jessek/hashdeep/
- I need to switch away from Storage Spaces and need help deciding what to go with.
MD5Deep/HashDeep: https://github.com/jessek/hashdeep/ (download the file under "Releases" on the right-hand side)
- Possible bitrot or similar in a folder of photos, looking for advice
- Need Advice for Long-Term Storage
md5deep/hashdeep (https://github.com/jessek/hashdeep - see the package download on the side under "Releases") - another command-line tool, a bit more complex, but here's one way to do it:
- Wrote This Windows Batch Script for Easy Use of HASHDEEP for MD5 Checksums
You can download hashdeep from here: https://github.com/jessek/hashdeep/releases/tag/v4.4
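The batch script itself isn't reproduced above, but the idea it wraps — record an MD5 manifest once, re-check it later to catch silent corruption — can be sketched in a few lines of Python. This is my own illustration, not the poster's script; the JSON manifest format and function names are assumptions:

```python
import hashlib
import json
from pathlib import Path


def md5_of(path: Path) -> str:
    """Stream a file through MD5 in chunks."""
    h = hashlib.md5()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def snapshot(root: Path, manifest: Path) -> None:
    """Record relative path -> MD5 for every file under root."""
    hashes = {str(p.relative_to(root)): md5_of(p)
              for p in sorted(root.rglob("*")) if p.is_file()}
    manifest.write_text(json.dumps(hashes, indent=2))


def verify(root: Path, manifest: Path) -> list[str]:
    """Return relative paths whose current MD5 no longer matches
    the recorded one (candidates for bitrot or accidental edits)."""
    recorded = json.loads(manifest.read_text())
    return [rel for rel, digest in recorded.items()
            if md5_of(root / rel) != digest]
```

MD5 is fine for detecting accidental corruption, which is all a bitrot check needs; it is not suitable where an attacker might deliberately craft collisions.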
- Maintenance for a Noob Data Hoarder Setup?
czkawka
- Is there software to compress large but similar files?
- Merge three separate partial libraries from external USB drives
- Tools to deduplicate files
https://github.com/qarmin/czkawka - by far the best of anything I've tried.
- fdupes: Identify or Delete Duplicate Files
I've used Czkawka (https://github.com/qarmin/czkawka) because it does Lanczos-based image duplicate detection, which makes it more practical for me.
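Czkawka's similar-image mode works by computing perceptual hashes of downscaled images (Lanczos is one of the resampling filters used for the downscale step) and comparing them by Hamming distance. As a concept sketch only — not czkawka's actual implementation — here is a pure-Python difference hash (dHash) over a grayscale pixel grid that has already been resized; the resampling step itself is omitted:

```python
def dhash(pixels: list[list[int]]) -> int:
    """Difference hash: set a bit wherever a pixel is brighter than
    its right-hand neighbour. `pixels` is a grayscale grid already
    resized to N rows of N+1 columns (so each row yields N bits)."""
    bits = 0
    for row in pixels:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (left > right)
    return bits


def hamming(a: int, b: int) -> int:
    """Count differing bits; a small distance means similar images."""
    return bin(a ^ b).count("1")
```

Because the hash encodes brightness gradients rather than exact bytes, re-encoded or slightly edited copies of the same picture land a small Hamming distance apart, which is what makes this more practical than exact-hash dedup for photos.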
- AllDup suddenly taking forever to process/delete selections
Maybe it's a setting you changed, or the files themselves; not sure. You can try another tool, czkawka, to see if you get better results with it.
- Is there a file duplicate finder that works with animated jpegxl-gif?
For static images I used https://github.com/qarmin/czkawka and it works well enough, I think. But when I used it on a folder with GIFs and their JXL conversions, it shows nothing. SURELY this could not be user error, rrrright?
- PhotoPrism: Browse Your Life in Pictures
I used to use DupeGuru which has some photo-specific dupe detection where you can fuzzy match image dupes based on content: https://dupeguru.voltaicideas.net/
But I switched over to czkawka, which has a better interface for comparing files, and seems to be a bit faster: https://github.com/qarmin/czkawka
Unfortunately, neither of these are integrated into Photoprism, so you still have to do some file management outside the database before importing.
I also haven't used Photoprism extensively yet (I think it's running on one of my boxes, but I haven't gotten around to setting it up), but I did find that it wasn't really built for file-based libraries. It's a little more heavyweight, but my research shows that Nextcloud Memories might be a better choice for me (it's not the first-party Nextcloud photos app, but another one put together by the community): https://apps.nextcloud.com/apps/memories
- Please don't post like 20 similar images to the art sites?
Czkawka can do this.
- I'm amazed how I find anything & why I have so many dupes!
There's always the well-respected tool, Czkawka. Or, if the CLI is your thing, jdupes is a good option.
- I saw a post regarding a crate to delete similar files
What are some alternatives?
- AntiDupl - A program to search for similar and defective pictures on the disk
- dupeguru - Find duplicate files
- snapraid - A backup program for disk arrays. It stores parity information of your data and it recovers from up to six disk failures
- jdupes - A powerful duplicate file finder and an enhanced fork of 'fdupes'
- cshatag - Detect silent data corruption under Linux using sha256 stored in extended attributes
- fdupes - FDUPES is a program for identifying or deleting duplicate files residing within specified directories
- RHash - Great utility for computing hash sums
- k4dirstat - K4DirStat (KDE Directory Statistics) is a small utility program that sums up disk usage for directory trees, very much like the Unix 'du' command. It displays the disk space used up by a directory tree, both numerically and graphically (copied from the Debian package description)
- PhotoPrism - AI-Powered Photos App for the Decentralized Web 🌈💎✨
- darktable - darktable is an open source photography workflow application and raw developer
- datacurator-filetree - a standard filetree for /r/datacurator [ and r/datahoarder ]