jdupes
Git
Metric | jdupes | Git |
---|---|---|
Mentions | 44 | 285 |
Stars | 1,681 | 49,964 |
Growth | - | 2.0% |
Activity | 0.0 | 10.0 |
Last commit | 7 months ago | 3 days ago |
Language | C | C |
License | MIT License | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
jdupes
-
File Servers... how are you handling duplicates
I recommend the use of jdupes, a fork of the well-known fdupes, to find duplicate files.
-
fdupes: Identify or Delete Duplicate Files
200 lines of Nim [1] seems to run about 9X faster than the 8000 lines of C in fdupes on a little test dir I have. If you need C, I think jdupes [2] is faster as @TacticalCoder points out a couple of times here. In my testing, `dups` is usually faster than `jdupes`, though.
[1] https://github.com/c-blake/bu/blob/main/dups.nim
[2] https://github.com/jbruchon/jdupes
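For readers curious what a duplicate finder does in principle, here is a minimal Python sketch. It is not how fdupes, jdupes, or `dups` are implemented; it only illustrates the common idea of grouping files by size first and hashing only the files whose sizes collide. The function names `file_digest` and `find_duplicates` are made up for this example, and it assumes that equal SHA-256 digests mean equal contents rather than doing a final byte-by-byte comparison.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return sorted groups of paths under root that share size and digest."""
    by_size = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a file with a unique size cannot have a duplicate
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        # Assumption: equal digests are treated as equal contents.
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return sorted(groups)
```

Calling `find_duplicates("some/dir")` returns a list of lists, one per set of matching files. The size pass is cheap and usually rules out most files before any data is read.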
-
I'm amazed how I find anything & why I have so many dupes!
There's always the well-respected tool, Czkawka. Or, if the CLI is your thing, jdupes is a good option.
- Anyone know of any good file deduplication tools?
-
Johnny Decimal
My research into this many years ago showed that jdupes was the best solution I could find for my use case.
https://github.com/jbruchon/jdupes
Though that works fine from a script perspective, I'd like a more interactive way of sorting directories etc. Identifying duplicates is just the first step; jdupes helps with linking the files (both soft and hard links come with caveats, though!), but that is mostly to save space, not to help with reorganisation.
- Jdupes: A powerful duplicate file finder
-
Does jdupes do a 'dry run' if you just specify directory(s) and no other options
I can work it out by looking at https://github.com/jbruchon/jdupes.
-
replace duplicates with hard links - I think jdupes is the answer, or maybe fclones (I have questions)
I have looked at a few alternatives and think jdupes is the one for me. I then found out it is not multi-threaded. I will still give it a go, but the developer of jdupes recommended fclones (https://github.com/jbruchon/jdupes/issues/186) for anyone dealing with large file systems who wants multi-threading. As I am using a HD, it may not be necessary.
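To make "replace duplicates with hard links" concrete, here is a rough Python sketch of the operation itself. It is not jdupes or fclones code. The helper name `replace_with_hard_link` and the `.hardlink-tmp` suffix are hypothetical, and the sketch assumes the caller has already verified that the two files have identical contents.

```python
import os


def replace_with_hard_link(original, duplicate):
    """Replace `duplicate` with a hard link to `original`.

    Returns True if a link was made, False if the pair was skipped.
    Assumes the caller has already confirmed the contents are identical.
    """
    src, dup = os.stat(original), os.stat(duplicate)
    if src.st_dev != dup.st_dev:
        return False  # hard links cannot cross filesystems
    if src.st_ino == dup.st_ino:
        return False  # both paths already refer to the same file
    tmp = duplicate + ".hardlink-tmp"  # hypothetical temporary name
    os.link(original, tmp)      # create the new link first
    os.replace(tmp, duplicate)  # then swap it over the duplicate
    return True
```

Creating the link under a temporary name and renaming it over the duplicate means the duplicate path never disappears if something fails halfway. After linking, both paths share the same data, so editing the file through one path changes what the other path sees.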
-
De-Duping a file server
jdupes is a fork of the old standby fdupes, but it has a Win32 release as well as supporting POSIX.
-
Any good duplicate file finder for windows?
jdupes is a tuned fork of the well-known fdupes, and has Win32 releases.
Git
- GitHub Git Mirror Down
- Four ways to solve the "Remote Origin Already Exists" error.
-
So You Think You Know Git – Git Tips and Tricks by Scott Chacon
Boy, I can't find this either (but also, the kernel mailing list is _really_ difficult to search). I really remember Linus saying something like "it's not a real SCM, but maybe someone could build one on top of it someday" or something like that, but I cannot figure out how to find that.
You _can_ see, though, that in his first README, he refers to what he's building as not a "real SCM":
https://github.com/git/git/commit/e83c5163316f89bfbde7d9ab23...
- Maintain-Git.txt
-
Git Commit Messages by Jeff King
Here is the direct link, as HN somehow removes the query string: https://github.com/git/git/commits?author=peff&since=2023-10...
- Git commit messages by Jeff King
- My favourite Git commit (2019)
-
Do we think of Git commits as diffs, snapshots, and/or histories?
I understand all that.
I'm saying, if you write a survey and one of the possible answers is "diff", but you don't clearly define what you mean by "diff", then don't be surprised if respondents use any reasonable definition that makes sense to them. Ask an ambiguous question, get a mishmash of answers.
The thing that Git uses for packfiles is called a "delta" by Git, but it's also reasonable to call it a "diff". After all, Git's delta algorithm is "greatly inspired by parts of LibXDiff from Davide Libenzi"[1]. Not LibXDelta but LibXDiff.
Yes, how Git stores blobs (using deltas) is orthogonal to how Git uses blobs. But while that orthogonality is useful for reasoning about Git, it's not wrong to think of a commit as the totality of what Git does, including that optimization. (Some people, when learning Git, stumble over the way it's described as storing full copies and think it's wasteful. For them to wrap their heads around Git, they have to understand that the optimization exists. Which makes sense because Git probably wouldn't be practical if it lacked that optimization.)
The reason I'm bringing all this up is, if you're trying to explain Git, which is what the original article is about, then it's very important to keep in mind that someone who is learning Git needs to know what you mean when you say "diff". Most people who already know Git would tend to gravitate toward the definition of "diff" that you're assuming (the thing that Git computes on the fly and never stores), but people who already know Git aren't the target audience when you're teaching Git.
---
[1] https://github.com/git/git/blob/master/diff-delta.c
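To illustrate the "computed on the fly and never stored" sense of "diff" discussed above, here is a small Python sketch that keeps two full snapshots and derives the diff only when asked. It uses Python's `difflib`, not Git's own diff or packfile delta code, and the snapshot contents and the function name `diff_snapshots` are invented for the example.

```python
import difflib

# Two full "snapshots" of the same file (made-up content).
snapshot_a = "line one\nline two\nline three\n"
snapshot_b = "line one\nline 2\nline three\n"


def diff_snapshots(old, new):
    """Compute a unified diff between two snapshots on demand."""
    return list(difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile="a",
        tofile="b",
    ))


if __name__ == "__main__":
    print("".join(diff_snapshots(snapshot_a, snapshot_b)), end="")
```

Nothing but the snapshots is stored here; the diff is a view derived from them. A storage-level delta, in contrast, would be a way of encoding one snapshot compactly in terms of another.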
-
The State of Merging Technology
Didn't Git get a new default merge strategy, `ort`? https://github.com/git/git/blob/master/Documentation/RelNote...
-
The bash book to rule them all
Yes, but you are referring to standalone scripts, not functions defined within a Bash script.
Compare for example the following helper code used for git command completion inside Bash and inside PowerShell.
Bash: https://github.com/git/git/blob/master/contrib/completion/gi...
What are some alternatives?
fdupes - FDUPES is a program for identifying or deleting duplicate files residing within specified directories.
scalar - Scalar: A set of tools and extensions for Git to allow very large monorepos to run on Git without a virtualization layer
dupeguru - Find duplicate files
PineappleCAS - A generic computer algebra system targeted for the TI-84+ CE calculators
rmlint - Extremely fast tool to remove duplicates and other lint from your filesystem
Subversion - Mirror of Apache Subversion
rdfind - find duplicate files utility
vscode-gitlens - Supercharge Git inside VS Code and unlock untapped knowledge within each repository — Visualize code authorship at a glance via Git blame annotations and CodeLens, seamlessly navigate and explore Git repositories, gain valuable insights via rich visualizations and powerful comparison commands, and so much more
czkawka - Multi functional app to find duplicates, empty folders, similar images etc.
linux - Linux kernel source tree
duperemove - Tools for deduping file systems
chromebrew - Package manager for Chrome OS [Moved to: https://github.com/chromebrew/chromebrew]