jdupes vs xxHash
| | jdupes | xxHash |
| --- | --- | --- |
| Mentions | 44 | 28 |
| Stars | 1,681 | 8,462 |
| Growth | - | - |
| Activity | 0.0 | 8.4 |
| Latest commit | 7 months ago | 3 days ago |
| Language | C | C |
| License | MIT License | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
jdupes
- File Servers... how are you handling duplicates
I recommend jdupes, a fork of the well-known fdupes, for finding duplicate files.
- fdupes: Identify or Delete Duplicate Files
200 lines of Nim [1] seems to run about 9X faster than the 8000 lines of C in fdupes on a little test dir I have. If you need C, I think jdupes [2] is faster as @TacticalCoder points out a couple of times here. In my testing, `dups` is usually faster than `jdupes`, though.
[1] https://github.com/c-blake/bu/blob/main/dups.nim
[2] https://github.com/jbruchon/jdupes
- I'm amazed how I find anything & why I have so many dupes!
There's always the well-respected tool, Czkawka. Or, if the CLI is your thing, jdupes is a good option.
- Anyone know of any good file deduplication tools?
- Johnny Decimal
My research into this many years ago turned up jdupes as the right/best solution I could find for my use case.
https://github.com/jbruchon/jdupes
Though that works fine from a script perspective, I'd like some more interactive way of sorting directories etc. Identifying is just the first step; jdupes helps with linking the files (both soft and hard links come with caveats, though!), but that is mostly to save space, not to help in reorganisation.
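To make those caveats concrete, here is a minimal C sketch (not jdupes' own code; the file names are hypothetical) of what "replace a duplicate with a link" actually does:

```c
/* Minimal sketch of dedup-by-linking, POSIX only. Not jdupes' code;
 * paths are made up for illustration. */
#include <stdio.h>
#include <unistd.h>

/* Replace `dup` with a link to `orig`. soft = 0 -> hard link. */
static int replace_with_link(const char *orig, const char *dup, int soft) {
    /* Drop the duplicate's name first. (Real tools link to a temporary
     * name and rename() over the duplicate, so a failed link() can't
     * lose the file; skipped here for brevity.) */
    if (unlink(dup) != 0) {
        perror("unlink");
        return -1;
    }
    /* Hard link: both names share one inode, so an edit through either
     * name changes the data seen by both, and space is freed only when
     * the last name is unlinked. Symlink: just a pointer to a path, so
     * it dangles if orig is moved or deleted. */
    int rc = soft ? symlink(orig, dup) : link(orig, dup);
    if (rc != 0)
        perror(soft ? "symlink" : "link");
    return rc;
}

int main(void) {
    return replace_with_link("a/original.dat", "b/duplicate.dat", 0);
}
```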
- Jdupes: A powerful duplicate file finder
- Does jdupes do a 'dry run' if you just specify directory(s) and no other options?
I can work it out by looking at https://github.com/jbruchon/jdupes.
- replace duplicates with hard links - I think jdupes is the answer, or maybe fclones (I have questions)
I have looked at a few alternatives and think jdupes is the one for me. Then I found out it was not multi-threaded, so I will give it a go, but the developer of jdupes recommended fclones (https://github.com/jbruchon/jdupes/issues/186) if you are dealing with large file systems and want multi-threading. But as I am using an HDD, it may not be necessary.
- De-Duping a file server
jdupes is a fork of the old standby fdupes, but it has a Win32 release as well as POSIX support.
- Any good duplicate file finder for Windows?
jdupes is a tuned fork of the well-known fdupes, and has Win32 releases.
xxHash
- The One Billion Row Challenge in CUDA: from 17 minutes to 17 seconds
> GPU Hash Table?
How badly would performance have suffered if you sha256'd the lines to build the map? I'm going to guess "badly"?
Maybe something like this in CUDA: https://github.com/Cyan4973/xxHash ?
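For a sense of what that suggestion looks like in practice, here is a minimal C sketch (table size and names are assumptions; real challenge solutions are far more elaborate) that buckets an input line with one-shot XXH3 instead of SHA-256:

```c
/* Bucket a line of input with xxHash's XXH3 for a hash table.
 * NBUCKETS is an arbitrary assumption for illustration. */
#include <stdio.h>
#include <string.h>
#include "xxhash.h"   /* from https://github.com/Cyan4973/xxHash */

#define NBUCKETS (1u << 20)   /* power of two so we can mask */

static unsigned bucket_for_line(const char *line, size_t len) {
    XXH64_hash_t h = XXH3_64bits(line, len);   /* one-shot 64-bit hash */
    return (unsigned)(h & (NBUCKETS - 1));
}

int main(void) {
    const char *line = "Hamburg;12.0";   /* sample 1BRC-style row */
    printf("bucket = %u\n", bucket_for_line(line, strlen(line)));
    return 0;
}
```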
- ETag and HTTP Caching
- Day 64: Implementing a basic Bloom Filter Using Java BitSet api
Examples of fast, simple hashes that are independent enough include MurmurHash, xxHash, the Fowler–Noll–Vo hash function, and many others.
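As a rough illustration of "independent enough", here is a minimal Bloom-filter sketch in C (the article uses Java's BitSet; the sizes below are arbitrary assumptions) that derives all its probes from two seeded xxHash values via double hashing:

```c
/* Toy Bloom filter: k probes from two seeded XXH3 hashes,
 * g_i(x) = h1(x) + i*h2(x) (Kirsch–Mitzenmacher double hashing). */
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include "xxhash.h"

#define MBITS (1u << 20)   /* filter size in bits (assumption) */
#define K     7            /* number of probes (assumption)    */

static uint8_t bits[MBITS / 8];

static void probes(const void *key, size_t len, uint32_t idx[K]) {
    uint64_t h1 = XXH3_64bits_withSeed(key, len, 0);
    uint64_t h2 = XXH3_64bits_withSeed(key, len, 1) | 1; /* force odd */
    for (int i = 0; i < K; i++)
        idx[i] = (uint32_t)((h1 + (uint64_t)i * h2) & (MBITS - 1));
}

static void add(const char *s) {
    uint32_t idx[K];
    probes(s, strlen(s), idx);
    for (int i = 0; i < K; i++)
        bits[idx[i] / 8] |= 1u << (idx[i] % 8);
}

static int maybe_contains(const char *s) {
    uint32_t idx[K];
    probes(s, strlen(s), idx);
    for (int i = 0; i < K; i++)
        if (!(bits[idx[i] / 8] & (1u << (idx[i] % 8))))
            return 0;   /* definitely absent */
    return 1;           /* possibly present (false positives allowed) */
}

int main(void) {
    add("xxHash");
    printf("%d %d\n", maybe_contains("xxHash"), maybe_contains("murmur"));
    return 0;
}
```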
- Closed-addressing hashtables implementation
- NIST Retires SHA-1 Cryptographic Algorithm
If you're only using the hash for non-cryptographic applications, there are much faster hashes: https://github.com/Cyan4973/xxHash
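As a sketch of that kind of non-cryptographic use, the following minimal C program checksums a file with xxHash's streaming XXH3 API (the file name is a placeholder, and error handling is pared down):

```c
/* Checksum a file chunk by chunk with the XXH3 streaming API. */
#include <stdio.h>
#include "xxhash.h"

int main(void) {
    FILE *f = fopen("some/file.bin", "rb");   /* placeholder path */
    if (!f) { perror("fopen"); return 1; }

    XXH3_state_t *st = XXH3_createState();
    XXH3_64bits_reset(st);

    char buf[1 << 16];
    size_t n;
    while ((n = fread(buf, 1, sizeof buf, f)) > 0)
        XXH3_64bits_update(st, buf, n);       /* feed each chunk */

    printf("xxh3 = %016llx\n",
           (unsigned long long)XXH3_64bits_digest(st));
    XXH3_freeState(st);
    fclose(f);
    return 0;
}
```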
- Does the checksum algorithm crc32c-intel support AMD Ryzen series 3000 or newer?
I found the benchmark results for the AMD Ryzen 5950X.
- [Study Project] A memory-optimized JSON data structure
But what's the catch, you're thinking? Well, it is a bit slower than its counterparts when it comes to deserializing (and marginally faster for serializing). To achieve a smaller footprint, it uses a few tricks, notably a custom hash table to deduplicate strings. This comes at a cost, of course (even when featuring xxHash to speed things up), but keeps the slowdown reasonable (I think).
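The project's actual table is more elaborate, but a toy C version of the string-deduplication trick described above might look like this (slot count and names are assumptions, and there is no resizing):

```c
/* Toy string interning: equal strings share one allocation,
 * found via an xxHash-keyed open-addressing table. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "xxhash.h"

#define SLOTS (1u << 16)            /* arbitrary; no resizing here */
static char *table[SLOTS];          /* open addressing, linear probing */

/* Return a canonical pointer for s, deduplicating equal strings. */
static const char *intern(const char *s) {
    size_t len = strlen(s);
    uint64_t i = XXH3_64bits(s, len) & (SLOTS - 1);
    while (table[i]) {
        if (strcmp(table[i], s) == 0)
            return table[i];        /* already stored: reuse it */
        i = (i + 1) & (SLOTS - 1);  /* probe the next slot */
    }
    table[i] = strdup(s);
    return table[i];
}

int main(void) {
    const char *a = intern("name"), *b = intern("name");
    printf("deduplicated: %s\n", a == b ? "yes" : "no");
    return 0;
}
```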
- What do you typically use for non-cryptographic hash functions?
Non-cryptographic hashes have collisions. For example, suppose you have content like "abcdefg" whose hashed value is "123"; with a weak hash algorithm, some other content like "abcdefZ" can also hash to "123", which basically means such a hash function fails to be a unique fingerprint of the particular content. BLAKE3, for example, can do 6-7 GB/s, which makes it pretty fast and secure. If your requirements accept collisions at a defined error rate, I would advise you to take a look at XXH3 if you need a very snappy hash algorithm; it can run at the pace of RAM access (30 GB/s+). But again, run tests on the particular equipment you are targeting; maybe the AES hardware-accelerated MeowHash will serve you better.
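Picking up the collision point: if a 64-bit digest feels tight for the data volume, xxHash also offers a 128-bit variant at nearly the same speed, which pushes the birthday-bound collision probability far lower. A minimal usage sketch:

```c
/* One-shot 128-bit XXH3 digest of a short message. */
#include <stdio.h>
#include <string.h>
#include "xxhash.h"

int main(void) {
    const char *msg = "abcdefg";
    XXH128_hash_t h = XXH3_128bits(msg, strlen(msg));
    printf("%016llx%016llx\n",
           (unsigned long long)h.high64,
           (unsigned long long)h.low64);
    return 0;
}
```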
- C++ gonna die😥
- rsync, article 3: How does rsync work?
What are some alternatives?
fdupes - FDUPES is a program for identifying or deleting duplicate files residing within specified directories.
BLAKE3 - the official Rust and C implementations of the BLAKE3 cryptographic hash function
dupeguru - Find duplicate files
meow_hash - Official version of the Meow hash, an extremely fast level 1 hash
rmlint - Extremely fast tool to remove duplicates and other lint from your filesystem
xxh - 🚀 Bring your favorite shell wherever you go through the ssh. Xonsh shell, fish, zsh, osquery and so on.
rdfind - find duplicate files utility
blake3 - An AVX-512 accelerated implementation of the BLAKE3 cryptographic hash function
czkawka - Multi functional app to find duplicates, empty folders, similar images etc.
smhasher - Hash function quality and speed tests
duperemove - Tools for deduping file systems
swift-crypto - Open-source implementation of a substantial portion of the API of Apple CryptoKit suitable for use on Linux platforms.