Dedupe

Open-source projects categorized as Dedupe

Top 13 Dedupe Open-Source Projects

  • restic

    Fast, secure, efficient backup program

    Project mention: Ask HN: What is your approach for managing personal digital assets? | news.ycombinator.com | 2024-03-24

    I religiously use Google contacts. It's the simplest way to keep people contacts up to date on Android.

    I archive all important documents in specific folders by subject and date. This is backed up to Backblaze with restic. https://restic.net/

    I use https://ente.io for pictures. I convinced my wife to use it, and she agreed to auto share her photos so I don't nag her for copies. It had simple import from Facebook and Google.

    I also keep extensive journals, which really helps to tie it all together. I can basically grep for hangouts, conversations, etc.

    I also separate work journal from personal, and have essentially a journal for each project. https://jodavaho.io/tags/bullet-journal.html for how.

    I religiously use Google calendar for all plans, you can easily search it for past events to get dates.

    I also use monicahq for some notes about things I should remember about people but the habit never stuck.

  • BorgBackup

    Deduplicating archiver with compression and authenticated encryption.

    Project mention: I Backup | news.ycombinator.com | 2024-02-27

  • dedupe

    A Python library for accurate and scalable fuzzy matching, record deduplication, and entity resolution.

    Project mention: Using deep learning for Fuzzy Matching | /r/datascience | 2023-07-06

  • yarn-deduplicate

    Deduplication tool for yarn.lock files

  • zingg

    Scalable identity resolution, entity resolution, data mastering and deduplication using ML

  • duplicut

    Remove duplicates from MASSIVE wordlist, without sorting it (for dictionary-based password cracking)

  • bees

    Best-Effort Extent-Same, a btrfs dedupe agent

    Project mention: Converted ext4 to btrfs, tried defrag and ran out of space | /r/btrfs | 2023-05-26

    Btrfs defrag 'will break up the reflinks of COW data' and 'may cause considerable increase of space usage depending on the broken up reflinks'. To try to fix this, I would run bees to try and deduplicate the now duplicate reflinks. It may be worth doing this from e.g. a livedisk though as out of space errors can cause things to break (so don't upgrade packages till you fix this).

  • imgdupes

    Identifying and removing near-duplicate images using perceptual hashing.

    Project mention: Reverse Image Search Local Files? (NOT A DUPLICATE FINDER) | /r/software | 2023-05-22

  • dupe-krill

    A fast file deduplicator

  • dduper

    Fast block-level out-of-band BTRFS deduplication tool.

  • daxus

    Daxus is a server state management library for React that provides full control over data, leading to a better user experience.

    Project mention: Enhancing User Experience with Daxus | dev.to | 2023-07-29

    Daxus is an exceptional server state management library tailored for React applications. With Daxus, developers have complete control over their data, allowing them to craft websites with superior user experiences.

  • swuniq

    A command-line tool for deduplicating entries in a file or stream with constant memory usage

  • Deduper

    The goal of this project is to make a deduper program that anybody can run on their computer to save storage space. (by ThatOneShortGuy)

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2024-03-24.

Index

What are some of the best open-source Dedupe projects? This list will help you:

#    Project            Stars
1    restic             23,429
2    BorgBackup         10,422
3    dedupe              3,960
4    yarn-deduplicate    1,366
5    zingg                 868
6    duplicut              777
7    bees                  574
8    imgdupes              328
9    dupe-krill            181
10   dduper                162
11   daxus                  89
12   swuniq                  5
13   Deduper                 0