YOLO-World
FLiPStackWeekly
Metric | YOLO-World | FLiPStackWeekly |
---|---|---|
Mentions | 3 | 86 |
Stars | 3,688 | 14 |
Growth | 8.0% | - |
Activity | 9.1 | 9.9 |
Last commit | 7 days ago | 5 days ago |
Language | Python | |
License | GNU General Public License v3.0 only | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
YOLO-World
- A History of CLIP Model Training Data Advances
2024 is shaping up to be the year of multimodal machine learning. From real-time text-to-image models and open-world vocabulary models to multimodal large language models like GPT-4V and Gemini Pro Vision, AI is primed for an unprecedented array of interactive multimodal applications and experiences.
- FLaNK Stack Weekly 19 Feb 2024
- Making My Bookshelves Clickable
Post author here. I like this idea. I plan to explore it and make a more generic solution. I'd love to have a point-and-click interface for annotating scenes.
For example, I'd like to be able to click on pieces of coffee equipment in a photo of my coffee setup and attach sticky note annotations that appear when you hover over each item.
For the bookshelves idea specifically, I would love to have a correction system in place. The problem isn't so much SAM as it is Grounding DINO, the model I'm using for object identification. I then pass each identified region to SAM and map the segmentation mask to the box.
Grounding DINO detects a lot of book spines, but often misses 1-2. I am planning to try out YOLO-World (https://github.com/AILab-CVC/YOLO-World), which, in my limited testing, performs better for this task.
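The detect-then-segment flow described in this comment can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the post author's code: `detect` and `segment` are hypothetical callables standing in for an object detector (Grounding DINO or YOLO-World) and for SAM, and `fake_detect`, `fake_segment`, and `add_missed_box` are invented here purely for demonstration. No real model API is called.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max) in pixels
Mask = List[List[bool]]           # row-major binary mask, same size as the image
Image = Sequence[Sequence[int]]   # toy grayscale image as nested sequences


@dataclass
class Region:
    """One detected object: its box plus the mask segmented for that box."""
    box: Box
    mask: Mask


def detect_then_segment(
    image: Image,
    detect: Callable[[Image], List[Box]],
    segment: Callable[[Image, Box], Mask],
) -> List[Region]:
    """Run the detector once, then segment each detected box separately."""
    return [Region(box=box, mask=segment(image, box)) for box in detect(image)]


def add_missed_box(
    regions: List[Region],
    image: Image,
    segment: Callable[[Image, Box], Mask],
    box: Box,
) -> List[Region]:
    """Hypothetical correction step: segment a box the detector missed."""
    regions.append(Region(box=box, mask=segment(image, box)))
    return regions


def fake_detect(image: Image) -> List[Box]:
    # Stand-in for a real detector; always reports one fixed box.
    return [(0, 0, 2, 2)]


def fake_segment(image: Image, box: Box) -> Mask:
    # Stand-in for a real segmenter; marks every pixel inside the box.
    x0, y0, x1, y1 = box
    width, height = len(image[0]), len(image)
    return [[x0 <= x < x1 and y0 <= y < y1 for x in range(width)]
            for y in range(height)]
```

Keeping the detector behind a plain callable means swapping one detector for another touches only the `detect` argument, while the manual correction step covers the occasional missed book spine.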
FLiPStackWeekly
What are some alternatives?
gorilla-cli - LLMs for your CLI
awk-raycaster - Pseudo-3D shooter written completely in gawk using raycasting technique
litellm - Call all LLM APIs using the OpenAI format. Use Bedrock, Azure, OpenAI, Cohere, Anthropic, Ollama, Sagemaker, HuggingFace, Replicate (100+ LLMs)
modelscope - ModelScope: bring the notion of Model-as-a-Service to life.
pulsar-thermal-pinot - Apache Pulsar - Apache Pinot - Thermal Sensor Data
FLiP-PulsarSummit2022Asia - FLiP-PulsarSummit2022Asia: Pulsar Summit Asia 2022
sherlock - Hunt down social media accounts by username across social networks
create-nifi-pulsar-flink-apps - How to create a real-time scalable streaming app using Apache NiFi, Apache Pulsar and Apache Flink SQL
OpenVoice - Instant voice cloning by MyShell.
CML_AMP_LLM_Chatbot_Augmented_with_Enterprise_Data
VToonify - [SIGGRAPH Asia 2022] VToonify: Controllable High-Resolution Portrait Video Style Transfer
Tabby - A terminal for a more modern age