| | czkawka | jdupes |
|---|---|---|
| Mentions | 364 | 44 |
| Stars | 20,433 | 1,681 |
| Growth | - | - |
| Activity | 7.4 | 0.0 |
| Last commit | about 2 months ago | about 1 year ago |
| Language | Rust | C |
| License | GNU General Public License v3.0 or later | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
czkawka
- Ask HN: How do you deduplicate files?
You want content-addressed storage; this works with rolling content hashes that identify common blocks of memory. `rsync` uses that technique to minimize bytes to be transferred. https://github.com/qarmin/czkawka is a GUI app and CLI tool to find identical files in general and similar images in particular.
The task is much simpler if you only want to find bit-identical entire files, not parts of files. In that case, you can just run a tool like `sha1sum` over each file and record the hash digest in a database. Identical files will always have the same hash, and non-identical files will, with high probability, have different hashes.
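The whole-file approach described above can be sketched in a few lines of Python. This is only an illustration, not how czkawka or any other tool listed here works: the function names `file_digest` and `find_duplicates` are made up for the example, and an in-memory dictionary stands in for the database.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-1 hex digest of a file, reading it in chunks."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Map digest -> sorted list of paths, keeping only digests seen more than once.

    Hypothetical helper: a dict plays the role of the "database" of digests.
    """
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.isfile(path) and not os.path.islink(path):
                by_digest[file_digest(path)].append(path)
    return {d: sorted(paths) for d, paths in by_digest.items() if len(paths) > 1}
```

Calling `find_duplicates("/some/dir")` returns one entry per group of files that share a digest. Since a matching hash only means "identical with high probability", a cautious script would still compare the bytes of each group before deleting anything.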
- Czkawka: Multi functional app to find duplicates, empty folders, similar images
- Duperemove – Tools for deduping file systems
You might be interested in this app: https://github.com/qarmin/czkawka
- Is there software to compress large but similar files?
- Merge three separate partial libraries from external USB drives
- Tools to deduplicate files
https://github.com/qarmin/czkawka is by far the best of anything I've tried.
- fdupes: Identify or Delete Duplicate Files
I've used Czkawka (https://github.com/qarmin/czkawka) because it does Lanczos-based image duplicate detection, which makes it more practical for me.
- AllDup suddenly taking forever to process/delete selections
Maybe it's a setting you made or the files, not sure. You can try another program, czkawka, to see if you get better results with it.
- Is there a file duplicate finder that works with animated jpegxl-gif?
For static images I used https://github.com/qarmin/czkawka and it works well enough, I think. But when I used it on a folder with GIFs and their JXL conversions, it showed nothing. SURELY this could not be user error, rrrright?
- PhotoPrism: Browse Your Life in Pictures
I used to use DupeGuru which has some photo-specific dupe detection where you can fuzzy match image dupes based on content: https://dupeguru.voltaicideas.net/
But I switched over to czkawka, which has a better interface for comparing files, and seems to be a bit faster: https://github.com/qarmin/czkawka
Unfortunately, neither of these is integrated into PhotoPrism, so you still have to do some file management outside the database before importing.
I also haven't used Photoprism extensively yet (I think it's running on one of my boxes, but I haven't gotten around to setting it up), but I did find that it wasn't really built for file-based libraries. It's a little more heavyweight, but my research shows that Nextcloud Memories might be a better choice for me (it's not the first-party Nextcloud photos app, but another one put together by the community): https://apps.nextcloud.com/apps/memories
jdupes
- File Servers... how are you handling duplicates
I recommend the use of jdupes, a fork of the well-known fdupes, to find duplicate files.
- fdupes: Identify or Delete Duplicate Files
200 lines of Nim [1] seems to run about 9X faster than the 8000 lines of C in fdupes on a little test dir I have. If you need C, I think jdupes [2] is faster as @TacticalCoder points out a couple of times here. In my testing, `dups` is usually faster than `jdupes`, though.
[1] https://github.com/c-blake/bu/blob/main/dups.nim
[2] https://github.com/jbruchon/jdupes
- I'm amazed how I find anything & why I have so many dupes!
There's always the well-respected tool, Czkawka. Or, if the CLI is your thing, jdupes is a good option.
- Anyone know of any good file deduplication tools?
- Johnny Decimal
My research into this many years ago found that jdupes was the right / best solution I could find for my use case.
https://github.com/jbruchon/jdupes
Though that works fine from a script perspective, I'd like some more interactive way of sorting directories etc. Identifying is just the first step. jdupes helps with linking the files (both soft and hard links come with caveats though!), but that is mostly to save space, not to help in reorganisation.
- Jdupes: A powerful duplicate file finder
- Does jdupes do a 'dry run' if you just specify directory(s) and no other options
I can work it out by looking at https://github.com/jbruchon/jdupes.
- replace duplicates with hard links - I think jdupes is the answer, or maybe fclones (I have questions)
I have looked at a few alternatives and think jdupes is the one for me. I then found out it is not multi-threaded. I will still give it a go, but the developer of jdupes recommended fclones (https://github.com/jbruchon/jdupes/issues/186) if you are dealing with large file systems and want multi-threading. As I am using a HD, it may not be necessary.
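For readers wondering what "replace duplicates with hard links" involves, here is a minimal Python sketch. It is not jdupes' or fclones' implementation. The function name `hardlink_duplicates`, the `dry_run` parameter, and the `.hardlink-tmp` suffix are all assumptions made for the example, and the caller is assumed to already have a group of candidate duplicates.

```python
import filecmp
import os


def hardlink_duplicates(keep, others, dry_run=True):
    """Replace each path in `others` with a hard link to `keep`.

    Hypothetical helper. Returns the paths that were linked, or that
    would be linked when dry_run is True.
    """
    linked = []
    keep_dev = os.stat(keep).st_dev
    for path in others:
        if os.path.samefile(keep, path):
            continue  # already the same underlying file
        if os.stat(path).st_dev != keep_dev:
            continue  # hard links cannot cross filesystems
        if not filecmp.cmp(keep, path, shallow=False):
            continue  # contents differ byte for byte; leave it alone
        if not dry_run:
            tmp = path + ".hardlink-tmp"  # assumed temporary name
            os.link(keep, tmp)      # create the new link next to the duplicate
            os.replace(tmp, path)   # then swap it into place
        linked.append(path)
    return linked
```

With the default `dry_run=True` nothing on disk changes, which makes it easy to review the list first. One caveat of hard links is that all names share the same data afterwards, so editing the file through one name changes it under every other name too.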
- De-Duping a file server
jdupes is a fork of the old standby fdupes, but it has a Win32 release as well as supporting POSIX.
- Any good duplicate file finder for windows?
jdupes is a tuned fork of the well-known fdupes, and has Win32 releases.
What are some alternatives?
dupeguru - Find duplicate files
fdupes - FDUPES is a program for identifying or deleting duplicate files residing within specified directories.
AntiDupl - A program to search similar and defect pictures on the disk
rmlint - Extremely fast tool to remove duplicates and other lint from your filesystem
PhotoPrism - AI-Powered Photos App for the Decentralized Web 🌈💎✨
rdfind - find duplicate files utility
darktable - darktable is an open source photography workflow application and raw developer
duperemove - Tools for deduping file systems
datacurator-filetree - a standard filetree for /r/datacurator [ and r/datahoarder ]
fclones - Efficient Duplicate File Finder