dedupe VS ripgrep

Compare dedupe and ripgrep and see how they differ.

dedupe

Deduplicate files within a given list of directories by keeping one copy and making the rest hard-links. (by Gumnos)

ripgrep

ripgrep recursively searches directories for a regex pattern while respecting your gitignore (by BurntSushi)
              dedupe                               ripgrep
Mentions      3                                    348
Stars         3                                    45,040
Growth        -                                    -
Activity      0.0                                  9.3
Last commit   over 6 years ago                     13 days ago
Language      Python                               Rust
License       BSD 2-clause "Simplified" License    The Unlicense
The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

dedupe

Posts with mentions or reviews of dedupe. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-03-17.
  • fdupes alternatives?
    1 project | /r/commandline | 18 Jun 2022
    I wrote https://github.com/Gumnos/dedupe which sounds like it might be useful to you. It's faster than several of the alternatives I've found (many run the checksum across the whole of every file, this uses the file-size as a first-line discriminator, and only if the files are the same size does it go to the trouble of checking the checksum of the files). I designed it for creating hard-links in my media collection, but in the --dry-run mode, it should emit the file-names allowing you to pass it to xargs to remove them if it looks copacetic.
  • File Management via CLI
    7 projects | /r/commandline | 17 Mar 2022
    You can use my dedupe.py script with the dry-run flag (-n) to find all the duplicates on your drive. If you run it without the dry-run flag, it will attempt to make hard-links so that each file exists only once on the drive with multiple hard-links to the underlying file. It should be pretty fast, only needing to checksum file-content in the event that files have the same size (several other such deduplication methods work by checksumming every file on the drive which can be slow).
  • What tools / utilities have you written that you use regularly?
    42 projects | /r/commandline | 17 Sep 2021
    a file-deduplication utility that hard-links duplicate files to save space (our family photo gallery gets pics put in multiple albums for various audiences, so I can cut down on a lot of duplication with this)
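
The approach described in the posts above (compare file sizes first, checksum only the files whose sizes collide, then hard-link the duplicates) can be sketched roughly as follows. This is a minimal illustration and not the actual dedupe source: the function names, the use of SHA-256, and the temporary-file naming are all assumptions made for the example.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks.

    SHA-256 is an assumption for this sketch; the posts only say "checksum".
    """
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(roots):
    """Return lists of paths with identical content under the given directories."""
    # First pass: bucket regular files by size, which needs no file reads.
    by_size = defaultdict(list)
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                if os.path.isfile(path) and not os.path.islink(path):
                    by_size[os.path.getsize(path)].append(path)

    # Second pass: checksum only the files that share a size with another file.
    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        by_digest = defaultdict(list)
        for path in paths:
            by_digest[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_digest.values() if len(g) > 1)
    return groups


def hardlink_duplicates(groups, dry_run=True):
    """Replace every duplicate with a hard link to the first file in its group."""
    for keep, *extras in groups:
        for extra in extras:
            if os.path.samefile(keep, extra):
                continue  # already the same underlying file
            print(f"{extra} -> {keep}")
            if not dry_run:
                tmp = extra + ".dedupe-tmp"  # hypothetical temporary name
                os.link(keep, tmp)           # create the new link first
                os.replace(tmp, extra)       # then swap it over the duplicate
```

Calling `hardlink_duplicates(find_duplicates(["some/dir"]))` only prints what would be linked, in the spirit of the dry-run mode mentioned in the posts; passing `dry_run=False` performs the linking. Hard links only work within a single filesystem, so `os.link` raises `OSError` if the duplicates live on different devices.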

ripgrep

Posts with mentions or reviews of ripgrep. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-04-17.
  • Ask HN: What software sparks joy when using?
    10 projects | news.ycombinator.com | 17 Apr 2024
    ripgrep - https://github.com/BurntSushi/ripgrep
  • Code Search Is Hard
    13 projects | news.ycombinator.com | 10 Apr 2024
    Basic code searching skills seem like something new developers are never explicitly taught, but which is an absolutely crucial skill to build early on.

    I guess the knowledge progression I would recommend would look something like this:

    - Learning about Ctrl+F, which works basically everywhere.

    - Transitioning to ripgrep https://github.com/BurntSushi/ripgrep - I wouldn't even call this optional, it's truly an incredible and very discoverable tool. Requires keeping a terminal open, but that's a good thing for a newbie!

    - Optional, but highly recommended: Learning one of the powerhouse command line editors. Teenage me recommended Emacs; current me recommends vanilla vim, purely because some flavor of it is installed almost everywhere. This is so that you can grep around and edit in the same window.

    - In the same vein, moving back from ripgrep and learning about good old fashioned grep, with a few flags rg uses by default: `grep -r` for recursive search, `grep -ri` for case insensitive recursive search, and `grep -ril` for case insensitive recursive "just show me which files this string is found in" search. Some others too, season to taste.

    - Finally hitting the wall with what ripgrep can do for you and switching to an actual indexed, dedicated code search tool.

  • Level Up Your Dev Workflow: Conquer Web Development with a Blazing Fast Neovim Setup (Part 1)
    12 projects | dev.to | 16 Mar 2024
    live grep: ripgrep
  • Ripgrep
    1 project | news.ycombinator.com | 25 Feb 2024
  • Modern Java/JVM Build Practices
    9 projects | news.ycombinator.com | 4 Jan 2024
    The world has moved on though to opinionated tools, and Rust isn't even the furthest in that direction (That would be Go). The equivalent of those two lines in Cargo.toml would be this example of a basic configuration from the jacoco-maven-plugin: https://www.jacoco.org/jacoco/trunk/doc/examples/build/pom.x... - That's 40 lines in the section to do the "defaults".

    Yes, you could add a load of config for files to include/exclude from coverage and so on, but the idea that that's a norm is way more common in Java projects than other languages. Like here's some example Cargo.toml files from complicated Rust projects:

    Servo: https://github.com/servo/servo/blob/main/Cargo.toml

    rust-gdext: https://github.com/godot-rust/gdext/blob/master/godot-core/C...

    ripgrep: https://github.com/BurntSushi/ripgrep/blob/master/Cargo.toml

    socketio: https://github.com/1c3t3a/rust-socketio/blob/main/socketio/C...

  • Ugrep – a more powerful, ultra fast, user-friendly, compatible grep
    27 projects | news.ycombinator.com | 30 Dec 2023
    I'm not clear on why you're seeing the results you are. It could be because your haystack is so small that you're mostly just measuring noise. ripgrep 14 did introduce some optimizations in workloads like this by reducing match overhead, but I don't think it's anything huge in this case. (And I just tried ripgrep 13 on the same commands above and the timings are similar if a tiny bit slower.)

    [1]: https://github.com/radare/ired

    [2]: https://github.com/BurntSushi/ripgrep/discussions/2597

  • Tell HN: My Favorite Tools
    14 projects | news.ycombinator.com | 24 Dec 2023
  • Boosting Your Linux Experience: Meet the Rust Tools for Efficient Development
    5 projects | dev.to | 12 Dec 2023
    Explore Ripgrep in the official repository: https://github.com/BurntSushi/ripgrep
  • Scrybble is the ReMarkable highlights to Obsidian exporter I have been looking for
    9 projects | /r/RemarkableTablet | 7 Dec 2023
    🔎🗃️ ripgrep or ugrep (search fast, use regex patterns or fuzzy search, pipe output to bash/zsh shell for further processing V coloring)
  • RFC: Add ngram indexing support to ripgrep (2020)
    2 projects | news.ycombinator.com | 30 Nov 2023

What are some alternatives?

When comparing dedupe and ripgrep you can also consider the following projects:

ripgrep-all - rga: ripgrep, but also search in PDFs, E-Books, Office documents, zip, tar.gz, etc.

telescope-live-grep-args.nvim - Live grep with args

file-arranger - Simple & capable Directory arranger/cleaner

fd - A simple, fast and user-friendly alternative to 'find'

xonsh - Python-powered, cross-platform, Unix-gazing shell.

ugrep - ugrep 5.1: A more powerful, ultra fast, user-friendly, compatible grep. Includes a TUI, Google-like Boolean search with AND/OR/NOT, fuzzy search, hexdumps, searches (nested) archives (zip, 7z, tar, pax, cpio), compressed files (gz, Z, bz2, lzma, xz, lz4, zstd, brotli), pdfs, docs, and more

tawk - Like awk, but using tcl as the scripting language.

the_silver_searcher - A code-searching tool similar to ack, but faster.

mpd_what - An mpd album art and info getter

fzf - A command-line fuzzy finder

ledger - Double-entry accounting system with a command-line reporting interface

alacritty - A cross-platform, OpenGL terminal emulator.