File Deduplication

This page summarizes the projects mentioned and recommended in the original post on /r/homelab

  • fdupes

    FDUPES is a program for identifying or deleting duplicate files residing within specified directories.

  • I recently used [fdupes](https://github.com/adrianlopezroche/fdupes) to find duplicate files from my Amazon Cloud Drive / Photos migration. It took about 2 days to scour through about 1.5TB worth of data.

  • scripts

    Miscellaneous scripts that serve a stand-alone purpose that might be useful for others. (by taltman)

  • Prior to settling on this approach, I found [this](https://unix.stackexchange.com/questions/277697/whats-the-quickest-way-to-find-duplicated-files) post to be very helpful. One of the respondents wrote [this](https://github.com/taltman/scripts/blob/master/unix_utils/find-dupes.awk) awk script that is supposedly very fast. However, it leverages the FreeBSD flavor of things. I [tried](https://github.com/taltman/scripts/issues/4) getting it to work on Linux, but couldn't, as my awk-fu skills aren't so good.
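The general strategy behind that awk script — group files by size first, then hash only the files whose sizes collide — can be sketched in Python. This is an illustrative sketch of the technique, not a port of the script itself:

```python
import hashlib
import os
from collections import defaultdict

def find_duplicates(root):
    """Return lists of paths under root whose contents are identical.

    Files are first grouped by size; only files with a size collision
    are hashed, so most files are never read at all.
    """
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                by_size[os.path.getsize(path)].append(path)
            except OSError:
                continue  # skip unreadable or vanished files

    by_hash = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue  # a unique size cannot be a duplicate
        for path in paths:
            h = hashlib.sha256()
            with open(path, "rb") as f:
                # hash in 1 MiB chunks to keep memory flat on large files
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    h.update(chunk)
            by_hash[(size, h.hexdigest())].append(path)

    return [group for group in by_hash.values() if len(group) > 1]
```

A further refinement used by tools like fdupes is to compare just the first few kilobytes of same-size files before committing to a full-content hash.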

  • czkawka

    Multi functional app to find duplicates, empty folders, similar images etc.

  • You can use czkawka to find and remove duplicates. It's free, easy to use, and pretty reliable. I also sometimes use StarWind's dedupe analyzer to check whether there is still any data that can be deduplicated.

NOTE: The number of mentions on this list reflects mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
