jdupes
DISCONTINUED
rdfind
Metric | jdupes | rdfind
---|---|---
Mentions | 44 | 16
Stars | 1,681 | 859
Growth | - | -
Activity | 0.0 | 4.9
Last commit | 6 months ago | 11 days ago
Language | C | C++
License | MIT License | GNU General Public License v3.0 or later
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
jdupes
- fdupes: Identify or Delete Duplicate Files
200 lines of Nim [1] seems to run about 9X faster than the 8000 lines of C in fdupes on a little test dir I have. If you need C, I think jdupes [2] is faster as @TacticalCoder points out a couple of times here. In my testing, `dups` is usually faster than `jdupes`, though.
- I'm amazed how I find anything & why I have so many dupes!
There's always the well-respected tool, Czkawka. Or, if the CLI is your thing, jdupes is a good option.
- Anyone know of any good file deduplication tools?
- Johnny Decimal
My research into this many years ago found that jdupes was the best solution for my use case.
https://github.com/jbruchon/jdupes
Though that works fine from a script perspective, I'd like a more interactive way of sorting directories etc. Identifying is just the first step; jdupes helps with linking the files (both soft and hard links come with caveats, though!), but that is mostly to save space, not to help with reorganisation.
- Any good duplicate file finder for windows?
jdupes is a tuned fork of the well-known fdupes, and has Win32 releases.
- FLaNK Stack Weekly 3 April 2023
- Backing Up Data: Tips/Advice for Tons of Unorganized Data and Duplicate Files from Multiple Sources
- Anyone running Bees? Or deduping data some other way?
If not bees, do you run other programs for deduping? I see jdupes has support for BTRFS (https://github.com/jbruchon/jdupes), and there is also duperemove (https://github.com/markfasheh/duperemove).
- Ask HN: Tool to find identical file subtrees scattered over disks
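
Several of the mentions above compare tool speed on a personal test directory. A minimal Python sketch of such a timing comparison follows. The helper names `time_command` and `compare` are hypothetical, and the command lines in the trailing comment are assumptions that should be checked against the installed versions of each tool.

```python
import shutil
import statistics
import subprocess
import time


def time_command(cmd, runs=3):
    """Run cmd `runs` times and return the median wall-clock time in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # Output is discarded; only elapsed time is measured.
        subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def compare(commands, runs=3):
    """Time each named command, skipping tools that are not on PATH."""
    results = {}
    for name, cmd in commands.items():
        if shutil.which(cmd[0]) is None:
            continue  # tool not installed; nothing to measure
        results[name] = time_command(cmd, runs=runs)
    return results


# Example (assumed invocations, not verified against any particular version):
# compare({
#     "jdupes": ["jdupes", "-r", "/path/to/testdir"],
#     "rdfind": ["rdfind", "-dryrun", "true", "/path/to/testdir"],
# })
```

Results from a harness like this depend heavily on the data set and on whether the file cache is warm, so the first run of each tool is usually not comparable with later runs.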
rdfind
- Self hosted, web gui, file duplication scanner
I use rdfind for this.
- Deduplication on EXT4
You can use rdfind to find all duplicates in your experiments dir and replace files with hardlinks. This way files will occupy disk space only once and all inode references will be to the same disk location.
- Pip and cargo are not the same
I use rdfind to deal with this: https://github.com/pauldreik/rdfind
- Backing Up Data: Tips/Advice for Tons of Unorganized Data and Duplicate Files from Multiple Sources
- Suggestions on how to identify & report on old stale data in file shares?
- File Deduplication
Another cool tool to do deduplication is rdfind (https://github.com/pauldreik/rdfind). You could keep running it each time after you copy files from another device.
- data hoarding software
For folks using Linux, rdfind is a seriously underrated tool. While it doesn't do image similarity %, it is very fast at finding exact duplicates of files.
- Go Find Duplicates: blazingly-fast simple-to-use tool to find duplicate files
As far as I know, the standard tool for this is rdfind. This new tool claims to be "blazingly fast", so it should provide something to show it. Ideally a comparison with rdfind, but even a basic benchmark would make it less dubious. https://github.com/pauldreik/rdfind
But the main problem is not the suspicious performance, it's the lack of explanation. The tool is supposed to "find duplicate files (photos, videos, music, documents)". Does that mean it is restricted to some file types? Does it consider identical photos with different metadata to be duplicates? Compare this with rdfind, which clearly describes what it does, provides a summary of its algorithm, and even mentions alternatives.
Overall, it may be a fine toy/hobby project (only 3 commits, 3 months ago); I didn't read the code (except to find the command-line options). I don't get why it got so much attention.
I use and test assorted duplicate finders regularly.
fdupes is the classic (going way way back) but it's really very slow, not worth using anymore.
The four I know that are worth trying these days (depending on data set, hardware, file arrangement and other factors, any one of these might be fastest for a specific use case) are https://github.com/jbruchon/jdupes , https://github.com/pauldreik/rdfind , https://github.com/jvirkki/dupd , https://github.com/sahib/rmlint
Had not encountered fclones before, will give it a try.
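
The hardlink approach described under "Deduplication on EXT4" above can be illustrated with a short Python sketch. This is not rdfind's implementation; it is a simplified stand-in that groups files by size, confirms matches with SHA-256, and replaces later copies with hard links to the first one. The function names (`file_digest`, `find_duplicates`, `hardlink_duplicates`) are hypothetical, empty files are skipped by assumption, and hard links only work when all files live on the same filesystem.

```python
import hashlib
import os
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return lists of paths under root whose contents are identical."""
    by_size = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and not path.is_symlink():
            by_size[path.stat().st_size].append(path)

    groups = []
    for size, paths in by_size.items():
        # Files with a unique size cannot have duplicates; empty files are skipped.
        if size == 0 or len(paths) < 2:
            continue
        by_hash = defaultdict(list)
        for p in paths:
            by_hash[file_digest(p)].append(p)
        groups.extend(g for g in by_hash.values() if len(g) > 1)
    return groups


def hardlink_duplicates(root, dry_run=True):
    """Replace duplicates with hard links to the first file in each group.

    Returns the approximate number of bytes that are (or would be) reclaimed.
    """
    saved = 0
    for group in find_duplicates(root):
        original, *dupes = group
        for dupe in dupes:
            if os.path.samefile(original, dupe):
                continue  # already the same inode
            saved += dupe.stat().st_size
            if not dry_run:
                # Link under a temporary name, then atomically swap it in.
                tmp = dupe.with_name(dupe.name + ".dedup-tmp")
                os.link(original, tmp)
                os.replace(tmp, dupe)
    return saved
```

Calling `hardlink_duplicates("experiments", dry_run=True)` first reports the potential savings without touching any file. Once files are hardlinked, editing one path changes the content seen through every linked path, which is one of the caveats mentioned in the jdupes discussion above.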
What are some alternatives?
fdupes - FDUPES is a program for identifying or deleting duplicate files residing within specified directories.
dupeguru - Find duplicate files
rmlint - Extremely fast tool to remove duplicates and other lint from your filesystem
duperemove - Tools for deduping file systems
czkawka - Multi functional app to find duplicates, empty folders, similar images etc.
fclones - Efficient Duplicate File Finder
phockup - Media sorting tool to organize photos and videos from your camera in folders by year, month and day.
btrfs-progs - Development of userspace BTRFS tools
cdecrypt - Decrypt Wii U NUS content — Forked from: https://code.google.com/archive/p/cdecrypt/