-
dedupe
Deduplicate files within a given list of directories by keeping one copy and making the rest hard-links. (by Gumnos)
I wrote https://github.com/Gumnos/dedupe, which sounds like it might be useful to you. It's faster than several of the alternatives I've found: many of them run a checksum across the whole of every file, whereas this uses file size as a first-line discriminator and only goes to the trouble of checksumming files that are the same size. I designed it for creating hard-links in my media collection, but in --dry-run mode it should emit the file names, so you can pass them to xargs to remove the duplicates if the output looks copacetic.
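
The size-then-checksum idea described above can be sketched roughly as follows. This is only an illustration of the general technique, not the code from the dedupe repository: the function names `file_digest` and `find_duplicates`, the choice of SHA-256, and the 1 MiB chunk size are all assumptions made for the example.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks (hypothetical helper)."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(directories):
    """Return lists of paths with identical content (hypothetical helper).

    Files are grouped by size first; a checksum is computed only for
    files that share their size with at least one other file.
    """
    by_size = defaultdict(list)
    for directory in directories:
        for root, _dirs, files in os.walk(directory):
            for name in files:
                path = os.path.join(root, name)
                # Skip symlinks and anything that is not a regular file.
                if os.path.isfile(path) and not os.path.islink(path):
                    by_size[os.path.getsize(path)].append(path)

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # unique size, so no checksum is needed
        by_digest = defaultdict(list)
        for path in paths:
            by_digest[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_digest.values() if len(g) > 1)
    return groups
```

This sketch only reports groups of identical files. It does not create hard-links or delete anything, and it does not check whether two paths are already hard-links to the same inode.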