bu vs czkawka

bu

B)asic|But-For U)tility Code/Programs (in Nim & Often Unix/POSIX/Linux Context) (by c-blake)

czkawka

Multi functional app to find duplicates, empty folders, similar images etc. (by qarmin)

Duplicates gtk-rs Rust Cleaner similar-images similar-music Multiplatform similar-videos

bu	Project	czkawka
16	Mentions	361
52	Stars	17,595
-	Growth	-
9.2	Activity	7.7
6 days ago	Latest Commit	7 days ago
Nim	Language	Rust
MIT License	License	GNU General Public License v3.0 or later

The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

bu

Posts with mentions or reviews of bu. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-12-06.

Nim
5 projects | news.ycombinator.com | 6 Dec 2023
I think Nim is great for small CLIs. Some examples are over at: https://github.com/c-blake/bu . To quantify "small", using tools themselves in bu/ (and Zsh *):
```
    wc -l --total=never **.nim|cols 1|cstats ms q.05 q.95
```
fdupes: Identify or Delete Duplicate Files
13 projects | news.ycombinator.com | 2 Nov 2023

200 lines of Nim [1] seems to run about 9X faster than the 8000 lines of C in fdupes on a little test dir I have. If you need C, I think jdupes [2] is faster as @TacticalCoder points out a couple of times here. In my testing, `dups` is usually faster than `jdupes`, though.
[1] https://github.com/c-blake/bu/blob/main/dups.nim
[2] https://github.com/jbruchon/jdupes
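
As a rough illustration of what a duplicate-file finder does, here is a minimal Python sketch of the common "group by size, then hash" approach. It is an assumption for illustration only and is not taken from `dups.nim`, fdupes, or jdupes, which may use different strategies; the names `file_digest` and `find_duplicates` are hypothetical.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return sorted groups of paths under root with identical contents."""
    # Pass 1: bucket regular files by size; this needs only metadata.
    by_size = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Pass 2: hash only files that share a size with another file.
    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return sorted(groups)
```

Bucketing by size first means most files are never read at all, which is one reason tools in this space can differ so much in speed.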
Things I've learned about building CLI tools in Python
16 projects | news.ycombinator.com | 24 Oct 2023

You're better off using a compiled language.
If you're interested in a language that's compiled, fast, but as easy and pleasant as Python - I'd recommend you take a look at [Nim](https://nim-lang.org).
And to prove what Nim's capable of - here's a cool repo with 100+ CLI apps someone wrote in Nim: [c-blake/bu](https://github.com/c-blake/bu)
Removing Garbage Collection from the Rust Language (2013)
9 projects | news.ycombinator.com | 11 Sep 2023

20 milliseconds? On my 7 year old Linux box, this little Nim program https://github.com/c-blake/bu/blob/main/wsz.nim runs to completion in 275 microseconds when fully statically linked with musl libc on Linux. That's with a stripped environment (with `env -i`). It takes more like 318 microseconds with my usual 54 environment variables. The program only does about 17 system calls, though.
Additionally, https://github.com/c-blake/cligen makes decent CLI tools a real breeze. If you like some of Go's qualities but the language seems too limited, you might like Nim: https://nim-lang.org. I generally find getting good performance much less of a challenge with Nim, but Nim is undeniably less well known with a smaller ecosystem and less corporate backing.
The Awk book’s 60-line version of Make
2 projects | news.ycombinator.com | 10 Sep 2023
Often whole program generation in a prog.lang (& ecosystem!) that you already know can substitute for a new prog.lang. Python even has eval. You may be interested in: https://github.com/c-blake/bu/blob/main/doc/rp.md
You can actually get pretty far depending upon boundaries with the always implicit command-option language (when launched from the shell language, anyway). For example, Ben's example can be adapted to:
```
    rp -m^\[A-Za-z\] 'echo nr," ",s[1]'
```
Learn GNU Awk with hundreds of examples and exercises
4 projects | news.ycombinator.com | 28 Aug 2023

You might consider: https://github.com/c-blake/bu/blob/main/doc/cols.md
That's in Nim, though that may not be much of a barrier. (There may also be other tools in bu/ of interest.)
GNU Parallel, where have you been all my life?
19 projects | news.ycombinator.com | 21 Aug 2023
This sounds like a job for what standard C calls "popen". You can do `import posix; for line in popen("ls", "r"): echo line` in Nim, though you obviously need to replace `echo line` with other desired processing and learn how to do that.
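
For readers who want to see the same "read a child process's output line by line" idea outside Nim, here is a minimal Python sketch using the standard `subprocess` module. The helper name `lines_from_command` is hypothetical, and this is an analogue of the popen pattern described above rather than a translation of any code in bu.

```python
import subprocess
import sys


def lines_from_command(argv):
    """Yield each output line of a child process as it is produced."""
    # stdout=PIPE connects the child's standard output to a pipe we can read,
    # which is the role popen(cmd, "r") plays in C.
    with subprocess.Popen(argv, stdout=subprocess.PIPE, text=True) as proc:
        for line in proc.stdout:
            yield line.rstrip("\n")


if __name__ == "__main__":
    # Replace print() with whatever per-line processing is needed.
    for line in lines_from_command(["ls"]):
        print(line)
```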
You might also want to consider `rp` which is a program generator-compiler-runner along the lines of `awk` but with all the code just Nim snippets interpolated into a program template: https://github.com/c-blake/bu/blob/main/doc/rp.md . E.g.:
```
    ls -l | rp -pimport\ stats -bvar\ r:RunningStat -wnf\>4 r.push\ 4.f -eecho\ r
```
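
To make the shape of that one-liner concrete, here is a hedged Python sketch of what it appears to compute: streaming statistics over one numeric column, skipping rows with too few fields. It assumes the field index is zero-based (so field 4 of `ls -l` output is the size column) and uses Welford's update for the running mean and variance; the `RunningStat` class below is a hypothetical stand-in, not Nim's implementation.

```python
import math


class RunningStat:
    """Streaming count, mean and variance (Welford's method; an assumption)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def push(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    def stddev(self):
        """Sample standard deviation; 0.0 when fewer than two values."""
        return math.sqrt(self._m2 / (self.n - 1)) if self.n > 1 else 0.0


def column_stats(lines, col=4):
    """Accumulate stats for whitespace-separated field `col` (zero-based)."""
    stat = RunningStat()
    for line in lines:
        fields = line.split()
        if len(fields) > col:  # mirrors the `nf>4` guard in the command
            stat.push(float(fields[col]))
    return stat
```

The guard on the field count is what lets the `total N` header line of `ls -l` pass through without a parse error.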
The Bipolar Lisp Programmer
3 projects | news.ycombinator.com | 11 Aug 2023

Nim is terse yet general and can be made even more so with effort. E.g., You can gin up a little framework that is even more terse than awk yet statically typed and trivially convertible to run much faster like https://github.com/c-blake/bu/blob/main/doc/rp.md
You can statically introspect code to then generate related/translated ASTs to create nearly frictionless helper facilities like https://github.com/c-blake/cligen .
You can do all of this without any real run-time speed sacrifices, depending upon the level of effort you put in / your expertise. Since it generates C/C++ or JavaScript, you get all the abilities of backend compilers almost out of the box, such as profile-guided optimization or, for JS, JIT compilation.
Ask HN: Why did Nim not catch-on like wild fire as Rust did?
16 projects | news.ycombinator.com | 25 Jun 2023

I don't know about all your other questions, but the https://github.com/c-blake/cligen CLI framework seems much lower effort / ceremony than even Rust's `argh` and is just about as old as `clap` (both started 8 years ago in 2015).
There are over 50 CLI utilities at https://github.com/c-blake/bu, many of which do something novel rather than just "re-doing ls/find/cat with a twist". While they are really more "ls/ps construction toolkits" with some default configs to get people going, I think https://github.com/c-blake/lc and https://github.com/c-blake/procs are nicer than Rust alternatives. I mention these since you seem interested in such tools.
Self Hosted SaaS Alternatives
17 projects | news.ycombinator.com | 5 Mar 2023

You are welcome. Thanks are too rarely offered. :-)
You may also be interested in word stemming (such as used by the snowball stemmer in https://github.com/c-blake/nimsearch ) or other NLP techniques. I don't know how internationalized/multi-lingual that stuff is, but conceptually you might want "series of stemmed words" to be the content fragments of interest.
Similarity scores have many applications. Weights on graph of cancelled downloads ranked by size might be one. :)
Of course, for your specific "truncation" problem, you might also be able to just do an edit distance against the much smaller filenames and compare data prefixes in files or use a SHA256 of a content-based first slice. ( There are edit distance algos in Nim in https://github.com/c-blake/cligen/blob/master/cligen/textUt.... as well as in https://github.com/c-blake/suggest ).
Or, you could do a little program like ndup/sh/ndup to create a "mirrored file tree" of such content-based slices then you could use any true duplicate-file finder (like https://github.com/c-blake/bu/blob/main/dups.nim) on the little signature system to identify duplicates and go from path suffixes in those clusters back to the main filesystem. Of course, a single KV store within one or two files would be more efficient than thousands of tiny files. There are many possibilities.
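
The "edit distance on filenames plus a hash of a content-based first slice" idea for truncated downloads can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in cligen or suggest: the function names, the 4096-byte slice length, and the name-distance threshold of 5 are all hypothetical, and the inputs are in-memory bytes to keep the example self-contained.

```python
import hashlib


def edit_distance(a, b):
    """Levenshtein distance between two strings, using two rolling rows."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete from a
                           cur[j - 1] + 1,              # insert into a
                           prev[j - 1] + (ca != cb)))   # substitute
        prev = cur
    return prev[-1]


def prefix_digest(data, slice_len=4096):
    """SHA-256 of the first slice_len bytes of the content."""
    return hashlib.sha256(data[:slice_len]).hexdigest()


def looks_truncated(name_a, data_a, name_b, data_b,
                    max_name_dist=5, slice_len=4096):
    """Guess whether one item is a truncated copy of the other."""
    # Compare only the prefix both items actually have.
    n = min(len(data_a), len(data_b), slice_len)
    if n == 0 or edit_distance(name_a, name_b) > max_name_dist:
        return False
    return prefix_digest(data_a, n) == prefix_digest(data_b, n)
```

For example, `looks_truncated("movie.mkv", full_bytes, "movie.mkv.part", partial_bytes)` compares names that differ by five edits and then checks that the shared leading bytes hash identically.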

czkawka

Posts with mentions or reviews of czkawka. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-12-09.

Is there software to compress large but similar files?
1 project | /r/DataHoarder | 11 Dec 2023
Merge three separate partial libraries from external USB drives
2 projects | /r/DataHoarder | 9 Dec 2023
Tools to deduplicate files
1 project | /r/DataHoarder | 7 Nov 2023

https://github.com/qarmin/czkawka is by far the best of anything I've tried
fdupes: Identify or Delete Duplicate Files
13 projects | news.ycombinator.com | 2 Nov 2023

I've used Czkawka (https://github.com/qarmin/czkawka) because it does Lanczos-based image duplicate detection, which makes it more practical for me.
AllDup suddenly taking forever to process/delete selections
1 project | /r/DataHoarder | 29 Sep 2023

Maybe it's a setting you made or the files, not sure. You can try another program, czkawka, to see if you get better results with it.
Is there a file duplicate finder that works with animated jpegxl-gif?
1 project | /r/jpegxl | 26 Aug 2023

For static images I used https://github.com/qarmin/czkawka and it works well enough, I think. But when I used it on a folder with gifs and their jxl conversions, it showed nothing. SURELY this could not be user error, rrrright?
PhotoPrism: Browse Your Life in Pictures
17 projects | news.ycombinator.com | 11 Jul 2023

I used to use DupeGuru which has some photo-specific dupe detection where you can fuzzy match image dupes based on content: https://dupeguru.voltaicideas.net/
But I switched over to czkawka, which has a better interface for comparing files, and seems to be a bit faster: https://github.com/qarmin/czkawka
Unfortunately, neither of these are integrated into Photoprism, so you still have to do some file management outside the database before importing.
I also haven't used Photoprism extensively yet (I think it's running on one of my boxes, but I haven't gotten around to setting it up), but I did find that it wasn't really built for file-based libraries. It's a little more heavyweight, but my research shows that Nextcloud Memories might be a better choice for me (it's not the first-party Nextcloud photos app, but another one put together by the community): https://apps.nextcloud.com/apps/memories
Please don't post like 20 similar images to the art sites?
2 projects | /r/StableDiffusion | 8 Jul 2023

Czkawka can do this.
I'm amazed how I find anything & why I have so many dupes!
4 projects | /r/DataHoarder | 8 Jul 2023

There's always the well-respected tool, Czkawka. Or, if the CLI is your thing, jdupes is a good option.
I saw a post regarding crate to delete similar files
1 project | /r/rust | 7 Jul 2023

What are some alternatives?

When comparing bu and czkawka you can also consider the following projects:

NimForUE - Nim plugin for UE5 with native performance, hot reloading and full interop that sits between C++ and Blueprints. This allows you to do common UE workflows like for example to extend any UE class in Nim and extending it again in Blueprint if you wish so without restarting the editor. The final aim is to be able to do in Nim what you can do in C++

dupeguru - Find duplicate files

Nim - Nim is a statically typed compiled systems programming language. It combines successful concepts from mature languages like Python, Ada and Modula. Its design focuses on efficiency, expressiveness, and elegance (in that order of priority).

jdupes - A powerful duplicate file finder and an enhanced fork of 'fdupes'.

ordiri

fdupes - FDUPES is a program for identifying or deleting duplicate files residing within specified directories.

OffensiveNim - My experiments in weaponizing Nim (https://nim-lang.org/)

AntiDupl - A program to search similar and defect pictures on the disk

awesome-selfhosted - A list of Free Software network services and web applications which can be hosted on your own servers

PhotoPrism - AI-Powered Photos App for the Decentralized Web 🌈💎✨

core - OPNsense GUI, API and systems backend

darktable - darktable is an open source photography workflow application and raw developer
