mpifileutils
dupd
Metric | mpifileutils | dupd |
---|---|---|
Mentions | 4 | 1 |
Stars | 160 | 109 |
Growth | 0.6% | - |
Activity | 5.1 | 0.0 |
Last commit | 21 days ago | 11 months ago |
Language | C | C |
License | BSD 3-clause "New" or "Revised" License | GNU General Public License v3.0 only |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
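The exact formula behind the activity number is not given here. As a rough illustration only, the sketch below assumes a hypothetical exponential decay (each commit's weight halves every `half_life_days`) and a percentile rank scaled to 0-10. The function names, the half-life value, and the ranking rule are all assumptions, not the site's actual method.

```python
def activity_score(commit_ages_days, half_life_days=30.0):
    """Raw score: each commit contributes a weight that halves every
    half_life_days, so recent commits count more than older ones.
    (Hypothetical weighting, chosen only for illustration.)"""
    return sum(0.5 ** (age / half_life_days) for age in commit_ages_days)


def relative_activity(raw_scores, name):
    """Map one project's raw score to a 0-10 scale: ten times the fraction
    of tracked projects with a strictly lower raw score."""
    target = raw_scores[name]
    below = sum(1 for score in raw_scores.values() if score < target)
    return round(10.0 * below / len(raw_scores), 1)


if __name__ == "__main__":
    # Ten made-up projects; "p9" has the most recent commits.
    tracked = {f"p{i}": activity_score([300 - 30 * i]) for i in range(10)}
    print(relative_activity(tracked, "p9"))
```

Under these assumptions, a project that scores above 90% of the tracked projects gets 9.0, which matches the "top 10%" reading described above.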
mpifileutils
- Pigz: A parallel implementation of gzip for multi-core machines
If you ever run into the limitations of a single machine, dbz2 is also a fun little app for this sort of thing. You can run it across multiple machines and it'll automatically balance the workload across them.
https://github.com/hpc/mpifileutils/blob/master/man/dbz2.1
- MpiFileUtils: File utilities designed for scalability and performance
- Go Find Duplicates: blazingly-fast simple-to-use tool to find duplicate files
If you want something that scales horizontally, dcmp from https://github.com/hpc/mpifileutils is an option.
- You can list a directory containing 8M files, but not with ls
dupd
- Go Find Duplicates: blazingly-fast simple-to-use tool to find duplicate files
I use and test assorted duplicate finders regularly.
fdupes is the classic (going way way back) but it's really very slow, not worth using anymore.
The four I know of that are worth trying these days are https://github.com/jbruchon/jdupes, https://github.com/pauldreik/rdfind, https://github.com/jvirkki/dupd and https://github.com/sahib/rmlint. Depending on data set, hardware, file arrangement and other factors, any one of these might be fastest for a specific use case.
Had not encountered fclones before, will give it a try.
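For readers unfamiliar with what these tools do, here is a minimal sketch of one common approach to duplicate detection: group files by size first, then hash only the files whose size collides. It is not how dupd, jdupes, rdfind, rmlint or fclones is implemented; the function name `find_duplicates` and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
import os
from collections import defaultdict


def find_duplicates(root):
    """Return a list of groups (sorted lists of paths) of identical files."""
    # Pass 1: bucket regular files by size; a file with a unique size
    # cannot have a duplicate, so it never needs to be read.
    by_size = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Pass 2: hash only the files that share a size with another file.
    by_hash = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue
        for path in paths:
            digest = hashlib.sha256()
            with open(path, "rb") as handle:
                # Read in 1 MiB chunks to keep memory use bounded.
                for chunk in iter(lambda: handle.read(1 << 20), b""):
                    digest.update(chunk)
            by_hash[(size, digest.hexdigest())].append(path)

    return [sorted(group) for group in by_hash.values() if len(group) > 1]
```

How much the size pass saves depends on how many files share a size, which is one reason relative speed can vary with the data set as the comment above notes.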
What are some alternatives?
fclones - Efficient Duplicate File Finder
rmlint - Extremely fast tool to remove duplicates and other lint from your filesystem
jdupes - A powerful duplicate file finder and an enhanced fork of 'fdupes'.
pigz - A parallel implementation of gzip for modern multi-processor, multi-core machines.
go-find-duplicates - Find duplicate files (photos, videos, music, documents) on your computer, portable hard drives etc.
duphard - A simple utility to detect duplicate files and replace them with hard links.
rdfind - find duplicate files utility
coreutils - Enhancements to the GNU coreutils (especially head)
fdupes - FDUPES is a program for identifying or deleting duplicate files residing within specified directories.