osxphotos
splink
| | osxphotos | splink |
|---|---|---|
| Mentions | 96 | 16 |
| Stars | 1,699 | 1,091 |
| Growth | - | 2.4% |
| Activity | 9.4 | 9.9 |
| Last commit | 3 days ago | 1 day ago |
| Language | Python | Python |
| License | MIT License | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
osxphotos
-
Cleaning up my 200GB iCloud with some JavaScript
> Any method that I've found to clean them up (exporting the originals, deleting them from the library, and then re-importing the JPEGs only seems easiest) will lose all of the years of metadata that I've built up in the library.
The open source tool osxphotos (https://github.com/RhetTbull/osxphotos) can help with this. You can export the JPEG images while preserving metadata using the third-party exiftool utility:
`osxphotos export /path/to/export --has-raw --skip-raw --exiftool`
This exports all images that have a raw pair but skips the raw component, then uses exiftool (https://exiftool.org/) to write the metadata (keywords, etc.) to the exported JPEG files. You can then re-import these into Photos either by dragging them in or by running `osxphotos import /path/to/export/*`
Both the export and import commands have many other options for controlling the export directory, etc. Run `osxphotos help export` for details, or `osxphotos docs` to open the docs in your browser. (Disclaimer: I'm the author of osxphotos)
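If you want to script the export step, a minimal Python sketch could wrap the command shown above. The helper names `build_export_command` and `run_export` are made up for illustration; the only flags used are the ones quoted in the comment (`--has-raw`, `--skip-raw`, `--exiftool`), and the sketch defaults to a dry run that just prints the command.

```python
import shlex
import shutil
import subprocess


def build_export_command(export_dir: str) -> list[str]:
    """Build the argument list for the export command quoted above."""
    return ["osxphotos", "export", export_dir, "--has-raw", "--skip-raw", "--exiftool"]


def run_export(export_dir: str, dry_run: bool = True) -> list[str]:
    """Print the command (dry run) or execute it; returns the argument list."""
    cmd = build_export_command(export_dir)
    if dry_run:
        # Show what would be run without touching the Photos library.
        print(shlex.join(cmd))
        return cmd
    if shutil.which("osxphotos") is None:
        raise FileNotFoundError("osxphotos is not on PATH")
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return cmd


run_export("/path/to/export")
```

Passing `dry_run=False` runs the real export, so it is worth checking the printed command first.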
-
pipx install osxphotos fails
See the issue tracker if you want to follow along. Hopefully this is an easy fix and I can push a patch today.
-
Delete empty albums
In response to a question on the osxphotos GitHub Discussions page, I wrote a quick script to prune empty albums and folders from Photos that can be run with osxphotos (version 0.65.0 and later). You can run the script directly from GitHub without downloading it first via:
-
Library backup
You could try opening the library with PowerPhotos, a commercial app that can manage multiple Photos libraries, to see if it can read it. You could also try my free open source command line tool, osxphotos. Install it, then run this command in the Terminal:
`osxphotos info --library /path/to/the/library`
This should print out a list of information about the library: number of photos, number of albums, keywords in the library, etc. If that works, then osxphotos can read the library and can likely export the photos for you so you could re-import them into a new library.
-
Exploring EXIF
I'm the author of the osxphotos[0] tool mentioned in the article. For photos in an Apple Photos library, osxphotos gives you access to a rich set of metadata beyond what's in the actual EXIF/IPTC/XMP of the image. Apple performs object classification and other AI techniques on your images but generally doesn't expose the results to the user. For example, photos are categorized by the objects in them (dog, cat, breed of dog, etc.), tagged with rich reverse geolocation info (neighborhood, landmarks, etc.), and given an interesting set of scores such as "overall aesthetic", "pleasant camera tilt", "harmonious colors", etc. These can be queried using osxphotos, either from the command line or in your own Python code. (Ref API docs[1])
For example, to find your "best" photos based on overall aesthetic score and add them to the album "Best Photos" you could run:
`osxphotos query --query-eval "photo.score.overall > 0.8" --add-to-album "Best Photos"`
To find good photos with trees in them you could try something like:
`osxphotos query --query-eval "photo.score.overall > 0.5" --label Tree --add-to-album "Good Tree Photos"`
There's quite a bit of other interesting data in Photos that you can explore with osxphotos. Run `osxphotos inspect` and it will show you all the metadata for whichever photo is currently selected in the Photos app.
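The same kind of filter can be sketched in Python. The block below is an illustration rather than code from the osxphotos docs: `best_photos` is a hypothetical helper, and the sample data uses stand-in objects that only mimic the `score.overall`, `labels` and `original_filename` attributes so the sketch runs without a Photos library.

```python
from types import SimpleNamespace


def best_photos(photos, min_score, label=None):
    """Keep photos scoring above min_score, optionally requiring a label."""
    return [
        p for p in photos
        if p.score.overall > min_score and (label is None or label in p.labels)
    ]


def fake_photo(name, overall, labels):
    """Stand-in for a photo object (assumption: only these attributes are needed)."""
    return SimpleNamespace(
        original_filename=name,
        score=SimpleNamespace(overall=overall),
        labels=labels,
    )


sample = [
    fake_photo("IMG_0001.jpg", 0.91, ["Tree", "Sky"]),
    fake_photo("IMG_0002.jpg", 0.42, ["Tree"]),
    fake_photo("IMG_0003.jpg", 0.77, ["Dog"]),
]

good_trees = best_photos(sample, 0.5, label="Tree")
print([p.original_filename for p in good_trees])  # prints ['IMG_0001.jpg']
```

On a Mac with osxphotos installed, the list returned by `osxphotos.PhotosDB().photos()` should be usable in place of `sample`, since those photo objects expose the same attributes.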
[0] https://github.com/RhetTbull/osxphotos
-
Third Party Apps that work with Apple Photos Library
osxphotos is my own tool for power users to interact with Photos from the command line: export, batch edit, sync metadata, import, etc.
-
Alpha support for macOS Sonoma
osxphotos v0.60.8 adds initial alpha support for macOS Sonoma (macOS 14.0.0 / Photos 9.0). Everything seems to be working but if you are beta testing Sonoma and use osxphotos I'd welcome any feedback you have!
- How can I export my iCloud photo library to Amazon Photos on Mac OS?
-
Shared Library: Albums Aren’t Shared
I'm the author of the free/open source tool osxphotos, which provides several utilities for working with Photos and exporting your photos. You can use the batch-edit feature to automatically add the album name as a keyword, and I believe keywords are shared across users. (I don't use shared libraries so can't confirm this.) I am working on a feature to then automatically re-create the albums from the keywords on the target library. For now, the keywords are a partial workaround.
-
any program for MACOS or for Ubuntu that is free that allows you to edit the meta tags of photos en masse. Thanks!
If you want to batch edit the metadata of photos that are in the Apple Photos app on a Mac, I'm the author of a free tool, osxphotos, that includes a batch-edit command for editing metadata in the Photos library.
splink
- Splink: Fast, accurate, scalable probabilistic data linkage
-
Ask HN: What projects are you working on?
https://github.com/moj-analytical-services/splink
-
Record linkage/Entity linkage
Record linkage has been a big part of a project I've been working on for 6 months now. I personally think a great, free solution is the splink package in Python, which can handle 10+ million rows. It implements the Fellegi-Sunter model (equivalent to a naive Bayes model), which is the classical model in record linkage. It can be trained in an unsupervised manner using some initial parameter estimation (the parameters are quite intuitive) and then expectation maximisation. The features in the model are pairwise string comparisons on your fields of interest. These can include exact equality; edit distance comparisons like Levenshtein distance and Jaro-Winkler; and phonetic comparisons like Soundex and Double Metaphone. The splink package will handle training the model and then all the graph theory at the end to connect all your links into clusters. All the details you'll need are in the links. https://www.robinlinacre.com/probabilistic_linkage/ https://moj-analytical-services.github.io/splink/
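To make the moving parts concrete, here is a toy sketch of Fellegi-Sunter scoring followed by clustering. It does not use Splink or its API. The field names, the m/u probabilities, the prior and the "close" threshold of 2 edits are all made-up values for illustration; in practice the m/u values would be estimated (for example with expectation maximisation) rather than hard-coded.

```python
import math
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Edit distance via the standard dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def comparison_level(a: str, b: str) -> str:
    """Bucket a pair of strings into one of three comparison levels."""
    if a == b:
        return "exact"
    return "close" if levenshtein(a, b) <= 2 else "different"


# Hypothetical parameters: m = P(level | match), u = P(level | non-match).
PARAMS = {
    "first_name": {"exact": (0.80, 0.01), "close": (0.15, 0.02), "different": (0.05, 0.97)},
    "surname": {"exact": (0.85, 0.005), "close": (0.10, 0.015), "different": (0.05, 0.98)},
}


def match_probability(rec_a: dict, rec_b: dict, prior: float = 0.001) -> float:
    """Sum log2 match weights (naive-Bayes style) and convert to a probability."""
    weight = math.log2(prior / (1 - prior))  # prior odds that a random pair matches
    for field, levels in PARAMS.items():
        m, u = levels[comparison_level(rec_a[field], rec_b[field])]
        weight += math.log2(m / u)
    odds = 2 ** weight
    return odds / (1 + odds)


def cluster(records: list, threshold: float) -> list:
    """Link pairs above the threshold, then take connected components (union-find)."""
    parent = list(range(len(records)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(records)), 2):
        if match_probability(records[i], records[j]) >= threshold:
            parent[find(i)] = find(j)
    groups = {}
    for i in range(len(records)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


people = [
    {"first_name": "john", "surname": "smith"},
    {"first_name": "jon", "surname": "smith"},
    {"first_name": "mary", "surname": "jones"},
]
print(cluster(people, threshold=0.5))  # prints [[0, 1], [2]]
```

Comparing every pair as done here is quadratic in the number of records, which is why real tools restrict the candidate pairs before scoring.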
-
What is the best approach to removing duplicate person records if the only identifier is person firstname middle name and last name? These names are entered in varying ways to the DB, thus they are free-formatted.
https://moj-analytical-services.github.io/splink/ is a FOSS Python package (but it runs against your db using SQL).
-
DuckDB – in-process SQL OLAP database management system
If you're curious, I've written a FOSS record linkage library that executes everything as SQL. It supports multiple SQL backends including DuckDB and Spark for scale, and runs faster than most competitors because it's able to leverage the speed of these backends: https://github.com/moj-analytical-services/splink
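As a rough illustration of what "executing linkage as SQL" can look like, the sketch below generates candidate record pairs with a self-join. It is not the SQL that Splink generates. The `person` table, its columns, and the choice of date of birth as the blocking key are assumptions, and it uses Python's built-in `sqlite3` module only so that it runs without extra dependencies.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE person (id INTEGER PRIMARY KEY, first_name TEXT, surname TEXT, dob TEXT)"
)
con.executemany(
    "INSERT INTO person VALUES (?, ?, ?, ?)",
    [
        (1, "john", "smith", "1980-01-01"),
        (2, "jon", "smith", "1980-01-01"),
        (3, "mary", "jones", "1975-06-30"),
        (4, "john", "smyth", "1991-12-12"),
    ],
)

# The join condition acts as a blocking rule: only records sharing a date of
# birth are compared, so the database never materialises every possible pair.
PAIRS_SQL = """
SELECT l.id AS id_l,
       r.id AS id_r,
       l.first_name = r.first_name AS first_name_match,
       l.surname = r.surname AS surname_match
FROM person AS l
JOIN person AS r
  ON l.dob = r.dob
 AND l.id < r.id
ORDER BY id_l, id_r
"""

pairs = con.execute(PAIRS_SQL).fetchall()
print(pairs)  # prints [(1, 2, 0, 1)]
```

Because the comparison is a plain query, the heavy lifting stays inside the database engine, which is the property the comment above credits for the speed.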
-
Ask HN: What have you created that deserves a second chance on HN?
Splink - a python library for probabilistic record linkage (fuzzy matching/entity resolution).
Splink is dramatically faster and works on much larger datasets than other open source libraries. I'm particularly proud of the fact we support multiple execution backends (at the moment, DuckDB, Spark, Athena and SQLite, but additional adaptors are relatively straightforward to write).
We've had >4 million PyPI downloads and it's used in government, academia and the private sector, often replacing extremely expensive proprietary solutions.
https://github.com/moj-analytical-services/splink
More info in blog posts here:
-
Conformed Dimensions problem that keeps recurring on every project
Splink is a SQL tool that can do this https://github.com/moj-analytical-services/splink
-
How do you join two sources with attributes that aren't identical?
Use a probabilistic record matching model such as Fellegi-Sunter. Check out the splink package in Python.
-
Splink 3: Fast, accurate and scalable record linkage (entity resolution) in Python
Main docs here: https://moj-analytical-services.github.io/splink
-
Splink 3: Fast, accurate and scalable fuzzy record linkage in Python with support for multiple backends (FOSS)
It'd be great to see Splink add value in this area! Do give us a shout if you have any questions. The best place to post is on the GitHub discussions: https://github.com/moj-analytical-services/splink/discussions
What are some alternatives?
exiftool - ExifTool meta information reader/writer
zingg - Scalable identity resolution, entity resolution, data mastering and deduplication using ML
icloud-drive-docker - Dockerized iCloud Client - make a local copy of your iCloud documents and photos, and keep it automatically up-to-date.
dedupe - A python library for accurate and scalable fuzzy matching, record deduplication and entity-resolution.
photos_time_warp - Batch adjust the date, time, or timezone of photos in Apple Photos from the Mac command line.
libpostal - A C library for parsing/normalizing street addresses around the world. Powered by statistical NLP and open geo data.
icloud_photos_downloader - A command-line tool to download photos from iCloud
sqlglot - Python SQL Parser and Transpiler
ipyflow - A reactive Python kernel for Jupyter notebooks.
entity-embed - PyTorch library for transforming entities like companies, products, etc. into vectors to support scalable Record Linkage / Entity Resolution using Approximate Nearest Neighbors.
Home Assistant - Open source home automation that puts local control and privacy first.
dblink - Distributed Bayesian Entity Resolution in Apache Spark