Metric | czkawka | datacurator-filetree
---|---|---
Mentions | 364 | 36
Stars | 20,515 | 1,500
Growth | - | -
Activity | 7.4 | 2.6
Last commit | about 2 months ago | 3 months ago
Language | Rust | Makefile
License | GNU General Public License v3.0 or later | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
czkawka
- Ask HN: How do you deduplicate files?
You want content-addressed storage; this works with rolling content hashes that identify common blocks of data. `rsync` uses that technique to minimize the number of bytes transferred. https://github.com/qarmin/czkawka is a GUI app and CLI tool that finds identical files in general and similar images in particular.
The task is much simpler if you only want to find bit-identical whole files rather than parts of files. In that case, you can run a tool like `sha1sum` over each file and record the hash digest in a database. Identical files will always have the same hash, and non-identical files will, with high probability, have different hashes.
- Czkawka: Multi functional app to find duplicates, empty folders, similar images
- Duperemove – Tools for deduping file systems
You might be interested in this app: https://github.com/qarmin/czkawka
- Is there software to compress large but similar files?
- Merge three separate partial libraries from external USB drives
- Tools to deduplicate files
https://github.com/qarmin/czkawka is by far the best of anything I've tried.
- fdupes: Identify or Delete Duplicate Files
I've used Czkawka (https://github.com/qarmin/czkawka) because it does Lanczos-based image duplicate detection, which makes it more practical for me.
- AllDup suddenly taking forever to process/delete selections
Maybe it's a setting you changed, or maybe it's the files; I'm not sure. You can try another program, czkawka, to see if you get better results with it.
- Is there a file duplicate finder that works with animated jpegxl-gif?
For static images I used https://github.com/qarmin/czkawka and it works well enough, I think. But when I used it on a folder with GIFs and their JXL conversions, it showed nothing. SURELY this could not be user error, rrrright?
- PhotoPrism: Browse Your Life in Pictures
I used to use DupeGuru which has some photo-specific dupe detection where you can fuzzy match image dupes based on content: https://dupeguru.voltaicideas.net/
But I switched over to czkawka, which has a better interface for comparing files, and seems to be a bit faster: https://github.com/qarmin/czkawka
Unfortunately, neither of these is integrated into Photoprism, so you still have to do some file management outside the database before importing.
I also haven't used Photoprism extensively yet (I think it's running on one of my boxes, but I haven't gotten around to setting it up), but I did find that it wasn't really built for file-based libraries. It's a little more heavyweight, but my research shows that Nextcloud Memories might be a better choice for me (it's not the first-party Nextcloud photos app, but another one put together by the community): https://apps.nextcloud.com/apps/memories
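The whole-file approach from the Ask HN excerpt above (hash every file, record the digest, and treat matching digests as duplicates) can be sketched in Python. This is a minimal illustration, not how czkawka itself works: it swaps `sha1sum` for SHA-256 from the standard `hashlib` module, uses an in-memory dictionary in place of a database, and the function names `file_digest` and `find_duplicates` are hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_duplicates(root: Path) -> list[list[Path]]:
    """Group regular files under root by digest; keep groups of two or more."""
    # Assumption: a plain dict stands in for the "database" of digests.
    by_digest: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file() and not path.is_symlink():
            by_digest[file_digest(path)].append(path)
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

Calling `find_duplicates(Path("some/folder"))` returns one list of paths per set of bit-identical files. A common refinement is to compare file sizes first and hash only the files whose sizes collide, which avoids reading most of the data.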
datacurator-filetree
- How do you store interest-based content? Do I store that content in separate filetype folders or a single folder with sub-directories for each media type?
For the most part I follow this file tree. However, when it comes to some of my interests, like electronics, I am unsure if I should keep splitting these interest-based files by file type, for example:
- Where should I put my product "mockups" folder
I have redesigned my entire computer to follow the datacurator methodology: https://github.com/roboyoshi/datacurator-filetree/tree/main/root
- Share your folder structure
P.S. I've been lurking on this sub, have considered this particular problem for a long time, and have read maybe everything Karl, Nayuki, Reddit, and Hacker News have had to say on the subject. Running into this post is a treat. If tags don't work out for you, roboyoshi and contributors have made a really nice unified file tree: https://github.com/roboyoshi/datacurator-filetree
- I have created an Automated Screenshot Sorting in bash that moves screenshots from a folder into named subfolders in the screenshot's folder of Roboyoshi's Datacurator Filetree.
As always, credit to u/Roboyoshi for the Datacurator filetree.
- What is your folder tree in Google Drive looks like?
- Dataset Organisation.. Need Inspiration!
But it will obviously depend on the use case. For example, there is JohnnyDecimal, or a simpler approach.
- Tool to clone file structure without the large files themselves?
This tool will be useful for generating repos like these and sharing them with friends without actually needing to share the terabytes of data.
- Tried to combine a few posts i saw on here
Back in the day I started with this structure, though: https://github.com/roboyoshi/datacurator-filetree
- Best method(s) for organizing files?
- My organisation structure; feedback appreciated
This is a mix of this post and https://github.com/roboyoshi/datacurator-filetree. I'm still having trouble with a few things:
What are some alternatives?
dupeguru - Find duplicate files
filetags - Management of simple tags within file names
jdupes - A powerful duplicate file finder and an enhanced fork of 'fdupes'.
pyShelf - A simple terminal based ebook server
fdupes - FDUPES is a program for identifying or deleting duplicate files residing within specified directories.
album-splitter - Split single-file MP3 albums into separate tracks. Downloads from YouTube supported.
AntiDupl - A program to search similar and defect pictures on the disk
appendfilename - Intelligent appending text to file names, considering file extensions and file tags
PhotoPrism - AI-Powered Photos App for the Decentralized Web 🌈💎✨
koreader - An ebook reader application supporting PDF, DjVu, EPUB, FB2 and many more formats, running on Cervantes, Kindle, Kobo, PocketBook and Android devices
darktable - darktable is an open source photography workflow application and raw developer
Kavita - Kavita is a fast, feature rich, cross platform reading server. Built with the goal of being a full solution for all your reading needs. Setup your own server and share your reading collection with your friends and family.