LAVIS
PhotoPrism
Metric | LAVIS | PhotoPrism
---|---|---
Mentions | 18 | 510
Stars | 8,781 | 32,687
Growth | 2.9% | 1.6%
Activity | 6.3 | 9.9
Last commit | 19 days ago | 5 days ago
Language | Jupyter Notebook | Go
License | BSD 3-clause "New" or "Revised" License | GNU General Public License v3.0 or later
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
LAVIS
- FLaNK AI for 11 March 2024
- FLaNK 04 March 2024
- [D] Why is most Open Source AI happening outside the USA?
For multimodal, there's China (*many), then Salesforce.
- Need help for a colab notebook running Lavis blip2_instruct_vicuna13b?
Been trying all day to get working inference for this example: https://github.com/salesforce/LAVIS/tree/main/projects/instructblip
- most sane web3 job listing
There have also been big breakthroughs in computer vision. Not that long ago it was hard to recognize whether a photo contained a bird; that's solved now by models like CLIP, YOLO, or Segment Anything. Now research has moved on to generating 3D scenes from images or interactively answering questions about images.
- I work at a non-tech company and have been asked to make software that is impossible. How do I explain it to my boss?
The new hotness is multimodal vision-language models like InstructBLIP that can interactively answer questions about images. Check out the examples in the GitHub repo; I would not have thought this was possible a few years ago.
- Two-minute Daily AI Update (Date: 5/15/2023)
Salesforce’s BLIP family has a new member: InstructBLIP, a vision-language instruction-tuning framework using BLIP-2 models. It has achieved state-of-the-art zero-shot generalization performance on a wide range of vision-language tasks, substantially outperforming BLIP-2 and Flamingo.
- InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning
GitHub
- Can I use my own art as a training set?
Most of my workflows are self-made. For captioning I used BLIP-2 in a custom script I made that automates the process by going into directories and their sub-directories and creating a .txt file beside each image. This way I can keep my images organized in their proper directories, without having to dump them all in a single place.
- FLiP Stack Weekly for 13-Feb-2023
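
The captioning workflow described in the "Can I use my own art as a training set?" mention above (walk directories and sub-directories, write a .txt file beside each image) could be sketched roughly as follows. This is not the commenter's script: the function names, the list of image extensions, and the skip-existing behaviour are all assumptions, and `placeholder_caption` is a hypothetical stand-in for a real model call such as BLIP-2.

```python
from pathlib import Path
from typing import Callable, Iterable

# Extensions treated as images; an assumption, adjust as needed.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def find_images(root: Path) -> Iterable[Path]:
    """Yield image files under root, including all sub-directories."""
    # sorted() materializes the listing before any caption files are created.
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            yield path


def write_captions(
    root: Path,
    caption_fn: Callable[[Path], str],
    overwrite: bool = False,
) -> int:
    """Write a .txt caption beside each image; return how many were written."""
    written = 0
    for image_path in find_images(root):
        caption_path = image_path.with_suffix(".txt")
        if caption_path.exists() and not overwrite:
            continue  # keep captions that already exist (possibly hand-edited)
        caption = caption_fn(image_path).strip()
        caption_path.write_text(caption + "\n", encoding="utf-8")
        written += 1
    return written


def placeholder_caption(image_path: Path) -> str:
    """Hypothetical stand-in for a captioning model such as BLIP-2."""
    return f"a photo named {image_path.stem}"
```

Calling `write_captions(Path("my_art"), placeholder_caption)` would leave the directory layout untouched and add one caption file next to each image; swapping in a real captioner only requires passing a different `caption_fn`.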
PhotoPrism
- Show HN: Memories, FOSS Google Photos alternative built for high performance
I have been using https://www.photoprism.app for a couple of years, and it works better than expected. With the latest updates it's actually quite fast, and the face tagging works reasonably well.
- Ente: Open-Source, E2E Encrypted, Google Photos Alternative
For self-hosting, there's Photoprism[1] as well.
Ente's strength lies in end-to-end encryption[2] and its cloud[3] offering so you don't have to worry about reliability.
So if self-hosting is what you're after, Immich, Photoprism and Damselfly (TIL!) are perhaps better designed to serve your needs.
[1]: https://github.com/photoprism/photoprism
[2]: https://ente.io/architecture
[3]: https://ente.io/reliability
- Switching to Android Was Easy
For quite a while I've also been searching for a solution that allows me to share galleries with my family without having to ask them to jump through hoops in order to access them.
After some searching I'm now testing photoprism [1], which is a fantastic application, especially for self-hosting of photos. There's no mobile app for it (yet) and user management is just starting to get implemented, but it shows a lot of promise. Unfortunately not yet enough for putting it on the tablet of my granny, but one can hope (and donate!)
Either way, I'm afraid that building a good mobile gallery app is an equally large task, after all the best solution would be to replace the users' native gallery-app with an equivalent that also supports custom Online-Galleries...
[1]: https://www.photoprism.app/
- I write HTTP services in Go after 13 years (Mat Ryer, 2024)
Out of curiosity, why no sort-of-established pkg and internal dirs? What do you think of the https://github.com/photoprism/photoprism structure?
- Escaping Surveillance Capitalism, at Scale
Thank you!
Ente was first a piece of hardware, then a self-hostable project, but we had a hard time monetizing both, which led to the E2EE pivot.
TIL about TagSpaces, thanks!
Our server can be open-sourced, but we're unsure of the value E2EE will provide, with services like Photoprism[1] and Immich[2] already doing a good job of serving customers who prefer to self host. In this context E2EE might become a constraint, rather than a feature.
[1]: https://github.com/photoprism/photoprism
[2]: https://github.com/immich-app/immich
- Google Photos alternative with OCR
I've seen GitHub issues like this one, https://github.com/photoprism/photoprism/issues/907, in which it is implied that this is very difficult.
- New Release 231128-f48ff16ef ⚙️🌈
- Photo gallery frontend with encryption and search
Hi. I want to implement an image server similar to Photoprism, using ImageAI to tag images based on objects and context. However, I don't want to spend too much time working on the frontend. At first I was thinking about using Danbooru with Flexbooru or the web interface on my phone, but it doesn't have any encryption or password protection (since its purpose is to be used as a public image board).
- Looking for photo management software
https://www.photoprism.app in Docker.
- Ask HN: How do you manage photos, philosophically?
PhotoPrism[0] and some ugly plumbing[1] to semantically tag all images in the gallery.
[0]: https://github.com/photoprism/photoprism
What are some alternatives?
pytorch-widedeep - A flexible package for multimodal-deep-learning to combine tabular data with text and images using Wide and Deep models in Pytorch
Piwigo - Manage your photos with Piwigo, a full featured open source photo gallery application for the web. Star us on Github! More than 200 plugins and themes available. Join us and contribute!
CLIP-Caption-Reward - PyTorch code for "Fine-grained Image Captioning with CLIP Reward" (Findings of NAACL 2022)
immich - High performance self-hosted photo and video management solution.
sparseml - Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models
librephotos - A self-hosted open source photo management service. This is the repository of the backend.
robo-vln - Pytorch code for ICRA'21 paper: "Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation"
Lychee - A great looking and easy-to-use photo-management-system you can run on your server, to manage and share photos.
DeepViewAgg - [CVPR'22 Best Paper Finalist] Official PyTorch implementation of the method presented in "Learning Multi-View Aggregation In the Wild for Large-Scale 3D Semantic Segmentation"
Photonix - A modern, web-based photo management server. Run it on your home server and it will let you find the right photo from your collection on any device. Smart filtering is made possible by object recognition, face recognition, location awareness, color analysis and other ML algorithms.
linkis - Apache Linkis builds a computation middleware layer to facilitate connection, governance and orchestration between the upper applications and the underlying data engines.
Photoview - Photo gallery for self-hosted personal servers [Moved to: https://github.com/photoview/photoview]