rsync
An open source utility that provides fast incremental file transfer. It also has useful features for backup and restore operations among many other use cases.
For CLI I'd say rmlint. (There is supposed to be a GUI but I was never able to get it working. YMMV.) The dev is in the sub sometimes. Very powerful. Someone mentioned checksums. This program can compute checksums and then store them in the file's metadata (extended attributes), meaning they don't have to be recalculated in the future if the file hasn't changed. So rescans are fast. As mentioned before, don't go too crazy with huge scans until you really know what you are doing.
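To make the checksum idea concrete, here is a minimal Python sketch of how checksum-based duplicate detection works in general. It is not rmlint's implementation, and the function names (`file_digest`, `find_duplicates`) are made up for illustration. It groups files by size first, so only files that could possibly match get hashed, and it only reports groups; it never deletes anything. It does not cache checksums between runs.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return a list of groups; each group is a sorted list of identical files."""
    # Step 1: bucket by size. Files of different sizes cannot be identical.
    by_size = defaultdict(list)
    for p in Path(root).rglob("*"):
        if p.is_file() and not p.is_symlink():
            by_size[p.stat().st_size].append(p)

    # Step 2: hash only the files that share a size with another file.
    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        by_hash = defaultdict(list)
        for p in paths:
            by_hash[file_digest(p)].append(p)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups
```

Running it on one small folder at a time and printing each group is a low-risk way to see what a scan would turn up before trusting a real tool with deletions.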
I have also heard people talking about using other programs that have deduplication built in as a way to accomplish this, most notably rsync and also BorgBackup. These require a bit more confidence in one's skills than I have at the moment for the task at hand.
Another CLI tool you should know about when dealing with large amounts of photos is exiftool. It's useful in a number of different ways.
The best thing to use with a graphical interface is DupeGuru. It is free and open source. It has a specific mode for photos, but I don't love it, so I just use the regular mode. I advise that you do not attempt to do everything in one batch. The results are overwhelming. Try doing it in pieces. That way you might also be able to establish a "master" copy to compare everything else to.
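The "master copy" approach can be sketched in a few lines of Python. This is only an illustration of the idea, not how DupeGuru does it, and the names (`sha256_of`, `already_in_master`) and directory arguments are hypothetical. It hashes everything in the master folder once, then lists which files in a smaller batch folder are byte-for-byte copies of something already in the master. It reports only and leaves deletion to you.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def already_in_master(master_dir, batch_dir):
    """Return the files in batch_dir whose content already exists in master_dir."""
    # Hash the master collection once and keep the digests in a set.
    master_hashes = {
        sha256_of(p) for p in Path(master_dir).rglob("*") if p.is_file()
    }
    # A batch file is redundant if its digest is already in the master set.
    return sorted(
        p
        for p in Path(batch_dir).rglob("*")
        if p.is_file() and sha256_of(p) in master_hashes
    )
```

Because each batch is checked against the same master set, the list of results stays small enough to review by hand, which is the point of working in pieces.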