C Deduplication

Open-source C projects categorized as Deduplication

Top 7 C Deduplication Projects

  • libpostal

    A C library for parsing/normalizing street addresses around the world. Powered by statistical NLP and open geo data.

    Project mention: [P] Better ways to clean lots of text? | reddit.com/r/MachineLearning | 2022-06-25

    use an address parser library like libpostal https://github.com/openvenues/libpostal

  • rmlint

    Extremely fast tool to remove duplicates and other lint from your filesystem

    Project mention: deleting duplicates programs? | reddit.com/r/commandline | 2022-06-10

    rmlint, my friend, is the last tool you will ever need for this

  • jdupes

    A powerful duplicate file finder and an enhanced fork of 'fdupes'.

    Project mention: Looking for a software or script to find duplicates | reddit.com/r/sysadmin | 2022-07-29

    Our current go-to deduplication tool for both Linux and Windows is jdupes, an optimized fork of the older fdupes. Most Linux distros should have both in repos, but if you only have fdupes, that's perfectly fine.

  • kvdo

    A pair of kernel modules which provide pools of deduplicated and/or compressed block storage.

    Project mention: How do I keep custom kernel modules working after kernel upgrade ? (Debian 11) | reddit.com/r/linuxquestions | 2022-07-20

    This is intended for building an rpm, but as far as I can see, you can steal the dkms commands and config file out of it: https://github.com/dm-vdo/kvdo/blob/master/kvdo.spec

  • vdo

    Userspace tools for managing VDO volumes.

    Project mention: "lvextend" freezes my Debian server | reddit.com/r/linuxquestions | 2022-05-31

    It is true that I have not had time to concentrate fully on this issue in the last days, and it is difficult for me to understand some of the subtleties of the subject (english is not my main language). I would like to thank you for your help and patience! I really appreciate it. FYI, I will continue to work on this issue and will update this topic. I've also asked the VDO development team to take a look at it and to tell me if they have any ideas, you can see that discussion here.

  • dupd

    CLI utility to find duplicate files

    Project mention: Go Find Duplicates: blazingly-fast simple-to-use tool to find duplicate files | news.ycombinator.com | 2021-08-29

    I use and test assorted duplicate finders regularly.

    fdupes is the classic (going way way back) but it's really very slow, not worth using anymore.

    The four I know are worth trying these days (depending on data set, hardware, file arrangement and other factors, any one of these might be fastest for a specific use case) are https://github.com/jbruchon/jdupes , https://github.com/pauldreik/rdfind , https://github.com/jvirkki/dupd , https://github.com/sahib/rmlint

    Had not encountered fclones before, will give it a try.

  • swuniq

    A command-line tool for deduplicating entries in a file or stream with constant memory usage

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2022-07-29.

What are some of the best open-source Deduplication projects in C? This list will help you:

#  Project    Stars
1  libpostal  3,536
2  rmlint     1,360
3  jdupes     1,185
4  kvdo         198
5  vdo          158
6  dupd          95
7  swuniq         2