rdfind vs duphard

| | rdfind | duphard |
|---|---|---|
| Mentions | 16 | 1 |
| Stars | 875 | 2 |
| Growth | - | - |
| Activity | 4.1 | 0.0 |
| Last Commit | about 1 month ago | almost 4 years ago |
| Language | C++ | Go |
| License | GNU General Public License v3.0 or later | GNU General Public License v3.0 only |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
rdfind
- Rdfind: A utility to find duplicate files, delete them or replace them with hardlinks
- Self hosted, web GUI, file duplication scanner
  I use rdfind for this.
- Is there a Mac app that will allow me to recursively go through thousands of folders, calculate the total folder size, then compare against all other folder sizes, and if the size is identical, delete the newer one?
  rdfind is available for macOS; I've been using it on Linux: https://github.com/pauldreik/rdfind
- Deduplication on EXT4
  You can use rdfind to find all duplicates in your experiments dir and replace the files with hardlinks. That way each file's contents occupy disk space only once, and all the duplicate paths become links to the same inode.
- How do I show non-duplicate files across 2 drives?
- Pip and cargo are not the same
  I use rdfind to deal with this: https://github.com/pauldreik/rdfind
- Backing Up Data: Tips/Advice for Tons of Unorganized Data and Duplicate Files from Multiple Sources
- This has probably happened to all of us at least once
  Yeah, I periodically download the full drives and just deduplicate with rdfind, hardlinking identical files.
- AMD/Xilinx Vivado rant
- recommends for de-duplication?
  I use rdfind on my Linux NAS. https://github.com/pauldreik/rdfind
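The hardlink workflow recommended in the threads above maps directly onto rdfind's stock flags. A minimal session might look like this (rdfind writes its findings to a `results.txt` report, and the destructive modes are off by default, so the first command changes nothing):

```shell
# Report duplicates only: rdfind writes its findings to results.txt
rdfind -dryrun true ~/data

# Replace each duplicate with a hardlink to the highest-ranked copy
# (all files must be on the same filesystem for this to work)
rdfind -makehardlinks true ~/data

# Or delete the duplicate copies outright
rdfind -deleteduplicates true ~/data
```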
duphard
- Go Find Duplicates: blazingly-fast, simple-to-use tool to find duplicate files
  For example, I maintain a tar file and a Docker image with Kafka connectors that share many jar files. Using duphard I can save hundreds of megabytes, or even more than a gigabyte. For a documentation website with many copies of the same image (some static site generators favor this practice for maintaining multiple versions), I can reduce the website size by 60%+, which makes ssh copies, docker pulls, etc. much faster, speeding up deployments.
  https://github.com/andmarios/duphard
What are some alternatives?
- fdupes - FDUPES is a program for identifying or deleting duplicate files residing within specified directories.
- mpifileutils - File utilities designed for scalability and performance.
- jdupes - A powerful duplicate file finder and an enhanced fork of 'fdupes'.
- fclones - Efficient Duplicate File Finder
- rmlint - Extremely fast tool to remove duplicates and other lint from your filesystem
- go-find-duplicates - Find duplicate files (photos, videos, music, documents) on your computer, portable hard drives etc.
- dupeguru - Find duplicate files
- czkawka - Multi functional app to find duplicates, empty folders, similar images etc.
- kindfs - Index filesystem into a database, then easily make queries e.g. to find duplicate files/dirs, or mount the index with FUSE.
- dupd - CLI utility to find duplicate files