BLIP
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
From what I understand, what matters most to you is having a model that's already trained on something, rather than the particular architecture. YOLO is probably fine, as would be some of the older detectors. You should be able to find a model that's been pretrained on COCO; you can look and see which classes are included. I don't know whether other broadly trained models are available that would serve your purpose. What I'd do is just run your pictures through a COCO-trained object detection model and see if the annotations do what you want.
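To make that suggestion concrete, here is a minimal sketch of running one photo through a COCO-pretrained detector and keeping the confident class names as tags. It assumes torchvision's Faster R-CNN rather than YOLO, simply because its pretrained weights ship with the COCO class names attached. The helper names `tags_from_detections` and `detect_tags` and the 0.6 score cutoff are made up for illustration.

```python
from typing import Iterable, List


def tags_from_detections(labels: Iterable[str], scores: Iterable[float],
                         min_score: float = 0.6) -> List[str]:
    """Keep each class name once if any of its detections reaches min_score."""
    return sorted({lab for lab, s in zip(labels, scores) if s >= min_score})


def detect_tags(image_path: str, min_score: float = 0.6) -> List[str]:
    """Run one image through a COCO-pretrained Faster R-CNN.

    Downloads the pretrained weights on first use.
    """
    # Imported lazily so the helper above works without torch installed.
    import torch
    from torchvision.io import ImageReadMode, read_image
    from torchvision.models.detection import (
        FasterRCNN_ResNet50_FPN_Weights,
        fasterrcnn_resnet50_fpn,
    )

    weights = FasterRCNN_ResNet50_FPN_Weights.DEFAULT
    model = fasterrcnn_resnet50_fpn(weights=weights).eval()
    preprocess = weights.transforms()
    categories = weights.meta["categories"]  # COCO class names

    image = read_image(image_path, mode=ImageReadMode.RGB)
    with torch.no_grad():
        pred = model([preprocess(image)])[0]

    labels = [categories[i] for i in pred["labels"].tolist()]
    return tags_from_detections(labels, pred["scores"].tolist(), min_score)
```

Calling `detect_tags("some_photo.jpg")` on a handful of your own pictures (the file name is a placeholder) should show fairly quickly whether the COCO classes are fine-grained enough for what you want.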
Though backing up a bit, there are also image captioning models that may do a better job of what you want for organizing your photos. I'm not really familiar with any, though I did come across BLIP the other day (I haven't used it): https://github.com/salesforce/BLIP
This may be a better way to get at what you want.
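For the captioning route, a rough sketch might look like the following. It assumes the Hugging Face `transformers` port of BLIP and the `Salesforce/blip-image-captioning-base` checkpoint rather than the scripts in the linked repo, and `caption_image` and `caption_to_slug` are hypothetical helper names, not part of BLIP.

```python
import re


def caption_to_slug(caption: str) -> str:
    """Turn a caption into a lowercase, hyphen-separated string for file names."""
    return re.sub(r"[^a-z0-9]+", "-", caption.lower()).strip("-")


def caption_image(image_path: str) -> str:
    """Generate a short caption for one image with BLIP.

    Downloads the checkpoint on first use.
    """
    # Imported lazily so caption_to_slug works without these packages installed.
    from PIL import Image
    from transformers import BlipForConditionalGeneration, BlipProcessor

    name = "Salesforce/blip-image-captioning-base"  # assumed checkpoint
    processor = BlipProcessor.from_pretrained(name)
    model = BlipForConditionalGeneration.from_pretrained(name)

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=30)
    return processor.decode(output_ids[0], skip_special_tokens=True)
```

A caption gives free-form text rather than a fixed class list, so something like `caption_to_slug(caption_image("some_photo.jpg"))` could feed a search index or a renaming script, with the file name again being a placeholder.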