Top 7 C Deduplication Projects
A C library for parsing/normalizing street addresses around the world. Powered by statistical NLP and open geo data.
Project mention: [P] Better ways to clean lots of text? | reddit.com/r/MachineLearning | 2022-06-25
Use an address parser library like libpostal: https://github.com/openvenues/libpostal
Extremely fast tool to remove duplicates and other lint from your filesystem.
Project mention: deleting duplicates programs? | reddit.com/r/commandline | 2022-06-10
rmlint, my friend, is the last tool you will ever need for this
A powerful duplicate file finder and an enhanced fork of 'fdupes'.
Project mention: Looking for a software or script to find duplicates | reddit.com/r/sysadmin | 2022-07-29
Our current go-to deduplication tool for both Linux and Windows is jdupes, an optimized fork of the older fdupes. Most Linux distros should have both in repos, but if you only have fdupes, that's perfectly fine.
A pair of kernel modules which provide pools of deduplicated and/or compressed block storage.
Project mention: How do I keep custom kernel modules working after kernel upgrade ? (Debian 11) | reddit.com/r/linuxquestions | 2022-07-20
This is intended for building an rpm, but as far as I can see, you can steal the dkms commands and config file out of it: https://github.com/dm-vdo/kvdo/blob/master/kvdo.spec
Userspace tools for managing VDO volumes.
Project mention: "lvextend" freezes my Debian server | reddit.com/r/linuxquestions | 2022-05-31
It is true that I have not had time to concentrate fully on this issue in the last few days, and it is difficult for me to understand some of the subtleties of the subject (English is not my main language). I would like to thank you for your help and patience! I really appreciate it. FYI, I will continue to work on this issue and will update this topic. I've also asked the VDO development team to take a look at it and tell me if they have any ideas; you can see that discussion here.
CLI utility to find duplicate files.
Project mention: Go Find Duplicates: blazingly-fast simple-to-use tool to find duplicate files | news.ycombinator.com | 2021-08-29
I use and test assorted duplicate finders regularly.
fdupes is the classic (going way, way back), but it's really very slow and not worth using anymore.
The four I know of that are worth trying these days (depending on data set, hardware, file arrangement and other factors, any one of these might be fastest for a specific use case) are https://github.com/jbruchon/jdupes , https://github.com/pauldreik/rdfind , https://github.com/jvirkki/dupd and https://github.com/sahib/rmlint
Had not encountered fclones before, will give it a try.
A command-line tool for deduplicating entries in a file or stream with constant memory usage
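How this tool achieves constant memory usage is not described here. One common way to deduplicate a stream in fixed memory is a Bloom filter, which keeps a fixed-size bit array instead of storing every entry, at the cost of occasionally dropping a unique entry as a false positive. The sketch below illustrates that idea only; the names `seen_filter`, `seen_before`, and `dedup_stream`, the filter size, and the choice of FNV-1a hashing are all assumptions made for illustration and are not taken from the project.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Illustrative sizes (assumptions): 2^20 bits = 128 KiB, 4 probes per entry. */
#define FILTER_BITS (1u << 20)
#define NUM_PROBES  4u

/* Hypothetical fixed-size filter: memory use does not grow with the input.
 * Must be zero-initialized before first use. */
typedef struct {
    uint8_t bits[FILTER_BITS / 8];
} seen_filter;

/* 64-bit FNV-1a hash of a byte string. */
uint64_t fnv1a64(const char *s, size_t len) {
    uint64_t h = 14695981039346656037ULL;   /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= (unsigned char)s[i];
        h *= 1099511628211ULL;              /* FNV prime */
    }
    return h;
}

/* Returns true if `entry` was probably seen before (false positives are
 * possible); otherwise records it in the filter and returns false. */
bool seen_before(seen_filter *f, const char *entry, size_t len) {
    assert(f != NULL && entry != NULL);
    uint64_t h = fnv1a64(entry, len);
    uint32_t h1 = (uint32_t)h;
    uint32_t h2 = (uint32_t)(h >> 32) | 1u; /* odd step for double hashing */
    bool all_set = true;
    for (uint32_t i = 0; i < NUM_PROBES; i++) {
        uint32_t bit = (h1 + i * h2) % FILTER_BITS;
        uint8_t mask = (uint8_t)(1u << (bit % 8));
        if (!(f->bits[bit / 8] & mask)) {
            all_set = false;
            f->bits[bit / 8] |= mask;
        }
    }
    return all_set;
}

/* Copies lines from `in` to `out`, skipping lines that were probably seen
 * already. Returns the number of lines written. Lines longer than the
 * buffer are handled in chunks, which is a simplification of this sketch. */
size_t dedup_stream(FILE *in, FILE *out, seen_filter *f) {
    char line[4096];
    size_t written = 0;
    while (fgets(line, sizeof line, in) != NULL) {
        if (!seen_before(f, line, strlen(line))) {
            fputs(line, out);
            written++;
        }
    }
    return written;
}
```

A caller would declare a zeroed filter (for example `static seen_filter filter;`) and call `dedup_stream(stdin, stdout, &filter)`. The filter never grows, so memory stays constant however long the stream is, but the false-positive rate rises as more distinct entries are recorded.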
C Deduplication related posts
Audio titles not working anymore| Windows Media Audio?
2 projects | reddit.com/r/software | 31 Jul 2022
Xz format considered inadequate for long-term archiving
8 projects | news.ycombinator.com | 23 Jul 2022
How do I keep custom kernel modules working after kernel upgrade ? (Debian 11)
1 project | reddit.com/r/linuxquestions | 20 Jul 2022
DupFinder is a duplicate file finder
1 project | reddit.com/r/commandline | 4 Jul 2022
"lvextend" freezes my Debian server
1 project | reddit.com/r/linuxquestions | 31 May 2022
New LVM VDO logical volume inactive at startup, even with "--activate y" parameter
2 projects | reddit.com/r/linuxquestions | 14 May 2022
Suggestions on how to identify & report on old stale data in file shares?
4 projects | reddit.com/r/sysadmin | 10 May 2022
What are some of the best open-source Deduplication projects in C? This list will help you: