| | kindfs | lsdup |
|---|---|---|
| Stars | 2 | 1 |
| | 3 | 5 |
| Growth | - | - |
| Activity | 4.1 | 10.0 |
| Last commit | 7 months ago | almost 2 years ago |
| Language | Python | Rust |
| License | GNU General Public License v3.0 only | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
kindfs
fdupes: Identify or Delete Duplicate Files
fdupes is really nice and fast, but (as far as I remember) it lacked two features I needed for my use case: (1) listing duplicate directories without listing all of their duplicate sub-contents, and (2) detecting that everything in one directory is also present elsewhere on the filesystem, regardless of file/directory structure, which is particularly useful when you have a bigmess/ directory that you progressively sort out into a clean/ directory. Put differently: fdupes helps to regain space, but it could not help me much to clean up a messy drive...
This is why I wrote https://github.com/karteum/kindfs (which indexes the filesystem into an SQLite DB and then enables various ways to process it).
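The directory-level deduplication described above can be sketched with a bottom-up hash: give each directory a digest derived from the sorted multiset of its children's content digests, so two directories holding identical contents compare equal even when file names differ. This is only an illustration of the idea, not kindfs's actual implementation (the hash choice and the SQLite indexing are omitted here; SHA-256 stands in for whatever kindfs uses):

```python
import hashlib
import os

def file_hash(path, chunk=1 << 20):
    """Hash a file's contents in chunks (SHA-256 as a stand-in)."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

def dir_hash(path):
    """Hash a directory as the sorted multiset of its entries' digests.

    Two directories with the same dir_hash hold identical contents,
    regardless of file names, so they can be reported as one duplicate
    pair instead of listing every duplicated file inside them.
    """
    parts = []
    for entry in os.scandir(path):
        if entry.is_dir(follow_symlinks=False):
            parts.append(dir_hash(entry.path))
        elif entry.is_file(follow_symlinks=False):
            parts.append(file_hash(entry.path))
    h = hashlib.sha256()
    for p in sorted(parts):  # sorting makes the digest name-independent
        h.update(p.encode())
    return h.hexdigest()
```

With per-directory digests stored in a database, "is everything in bigmess/ also present in clean/?" reduces to set queries over digests rather than walks over the tree.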
Go Find Duplicates: blazingly-fast simple-to-use tool to find duplicate files
FWIW, if people are interested, I wrote https://github.com/karteum/kindfs for the purpose of indexing the hard drive, with the following goals
lsdup
fdupes: Identify or Delete Duplicate Files
Writing a program like this is one of the first exercises I give myself when learning a new programming language, because it touches a little bit of everything (reading files, output, CLI handling, using libraries, hashmaps, functions, loops, conditionals, etc.) and isn't too onerous to implement.
My latest (it's a few years old at this point) is lsdup (Rust version), which uses BLAKE3 for hashing file contents: https://github.com/redsaz/lsdup/
All it does is list groups of duplicate files, grouped by hash and ordered by size. I'll usually pipe the output to a file, edit the list however I want, then run a different script to process the result. It works well enough.
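The workflow described above (hash every file, group by digest, order groups by size) is compact enough to sketch. This is an illustration of the same idea, not lsdup's actual code: lsdup is in Rust and uses BLAKE3, while SHA-256 from the standard library stands in here, with a size pre-bucket so only same-size files ever get hashed:

```python
import hashlib
import os
from collections import defaultdict

def find_duplicate_groups(root):
    """Group files under root by content hash, largest files first.

    Returns a list of (size, digest, [paths]) tuples, one per group
    of two or more identical files.
    """
    # Pass 1: bucket by size; files with a unique size can't be duplicates.
    by_size = defaultdict(list)
    for dirpath, _, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path):
                by_size[os.path.getsize(path)].append(path)

    # Pass 2: hash only the candidates, then group by digest.
    groups = []
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue
        by_hash = defaultdict(list)
        for path in paths:
            h = hashlib.sha256()
            with open(path, "rb") as f:
                while block := f.read(1 << 20):
                    h.update(block)
            by_hash[h.hexdigest()].append(path)
        for digest, dupes in by_hash.items():
            if len(dupes) > 1:
                groups.append((size, digest, dupes))

    groups.sort(key=lambda g: g[0], reverse=True)  # biggest savings first
    return groups
```

Printing the groups and piping to a file reproduces the "list first, process with a separate script later" workflow the comment describes.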
What are some alternatives?
- rmlint - Extremely fast tool to remove duplicates and other lint from your filesystem
- dude - Duplicates Detector is a cross-platform GUI utility for finding duplicate files, allowing you to delete or link them to save space. Duplicate files are displayed and processed on two synchronized panels for efficient and convenient operation.
- rdfind - Find duplicate files utility
- fclones - Efficient Duplicate File Finder
- jdupes - A powerful duplicate file finder and an enhanced fork of 'fdupes'
- fdupes - FDUPES is a program for identifying or deleting duplicate files residing within specified directories
- czkawka - Multi-functional app to find duplicates, empty folders, similar images, etc.
- duff - Command-line utility for finding duplicate files
- duperemove - Tools for deduping file systems