| | mpifileutils | kindfs |
|---|---|---|
| Mentions | 4 | 2 |
| Stars | 160 | 3 |
| Growth | 0.6% | - |
| Activity | 5.1 | 4.1 |
| Latest commit | 21 days ago | 7 months ago |
| Language | C | Python |
| License | BSD 3-Clause "New" or "Revised" License | GNU General Public License v3.0 only |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
mpifileutils

- Pigz: A parallel implementation of gzip for multi-core machines
  If you ever run into the limitations of a single machine, dbz2 is also a fun little app for this sort of thing. You can run it across multiple machines and it'll automatically balance the workload across them.
  https://github.com/hpc/mpifileutils/blob/master/man/dbz2.1
- MpiFileUtils: File utilities designed for scalability and performance
- Go Find Duplicates: blazingly-fast simple-to-use tool to find duplicate files
  If you want something that scales horizontally, dcmp from https://github.com/hpc/mpifileutils is an option.
- You can list a directory containing 8M files, but not with ls
kindfs

- fdupes: Identify or Delete Duplicate Files
  fdupes is really nice and fast, but (as far as I remember) it lacked two features that I needed for my use case: (1) listing duplicate directories without listing all of their duplicate sub-contents, and (2) detecting that all of the contents of one directory are already present in another part of the FS (regardless of file/directory structure), which is particularly useful when you have a bigmess/ directory that you progressively sort into a clean/ directory. Put differently: fdupes helps to regain space, but it could not help me much to clean up a messy drive.
  This is why I wrote https://github.com/karteum/kindfs, which indexes the filesystem into an SQLite database and then enables various ways to process it.
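The "all contents of one directory are present elsewhere" check described above can be sketched independently of kindfs. A minimal, hypothetical Python version (hash every file's contents and compare the resulting sets; this is not kindfs's actual algorithm, just an illustration of the idea) might look like:

```python
import hashlib
import os


def file_sha256(path, bufsize=1 << 20):
    """Hash a file's contents in chunks, so large files don't fill memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(bufsize):
            h.update(chunk)
    return h.hexdigest()


def hashes_under(root):
    """Set of content hashes of every regular file below root."""
    out = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            out.add(file_sha256(os.path.join(dirpath, name)))
    return out


def dir_contained_in(mess, clean):
    """True if every file under mess/ has a byte-identical copy somewhere
    under clean/, regardless of how the files are arranged in directories."""
    return hashes_under(mess) <= hashes_under(clean)
```

Because the comparison is a plain set-inclusion test on content hashes, it ignores file names and directory layout entirely, which matches the bigmess/-vs-clean/ use case above.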
- Go Find Duplicates: blazingly-fast simple-to-use tool to find duplicate files
  FWIW, if people are interested, I wrote https://github.com/karteum/kindfs for the purpose of indexing the hard drive, with the following goals
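The indexing approach the author describes (walk the filesystem into an SQLite database, then query it in various ways) can be sketched as follows; the table layout, function names, and the whole-file hashing are assumptions for illustration, not kindfs's actual schema:

```python
import hashlib
import os
import sqlite3


def index_tree(root, db_path=":memory:"):
    """Walk root and record (path, size, sha256) for every regular file
    in an SQLite table. Reads each file whole; a real tool would chunk."""
    con = sqlite3.connect(db_path)
    con.execute(
        "CREATE TABLE IF NOT EXISTS files "
        "(path TEXT PRIMARY KEY, size INTEGER, hash TEXT)"
    )
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                digest = hashlib.sha256(f.read()).hexdigest()
            con.execute(
                "INSERT OR REPLACE INTO files VALUES (?, ?, ?)",
                (path, os.path.getsize(path), digest),
            )
    con.commit()
    return con


def duplicate_groups(con):
    """Return lists of paths that share the same content hash.
    (Uses newline as a separator, so paths containing newlines would break
    this sketch.)"""
    rows = con.execute(
        "SELECT group_concat(path, '\n') FROM files "
        "GROUP BY hash HAVING count(*) > 1"
    )
    return [row[0].split("\n") for row in rows]
```

Once the index exists, other queries (duplicate directories, containment checks, largest files) become simple SQL over the same table, which is the main appeal of indexing first and processing later.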
What are some alternatives?
fclones - Efficient Duplicate File Finder
rmlint - Extremely fast tool to remove duplicates and other lint from your filesystem
rdfind - find duplicate files utility
pigz - A parallel implementation of gzip for modern multi-processor, multi-core machines.
jdupes - A powerful duplicate file finder and an enhanced fork of 'fdupes'.
duphard - A simple utility to detect duplicate files and replace them with hard links.
dude - Duplicates Detector is a cross-platform GUI utility for finding duplicate files, allowing you to delete or link them to save space. Duplicate files are displayed and processed on two synchronized panels for efficient and convenient operation.
coreutils - Enhancements to the GNU coreutils (especially head)
fdupes - FDUPES is a program for identifying or deleting duplicate files residing within specified directories.
czkawka - Multi functional app to find duplicates, empty folders, similar images etc.