| | Detic | clip-fields |
|---|---|---|
| Mentions | 11 | 2 |
| Stars | 1,769 | 139 |
| Growth | 1.0% | - |
| Activity | 1.9 | 4.6 |
| Latest commit | about 1 month ago | 2 months ago |
| Language | Python | Python |
| License | Apache License 2.0 | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
Detic
-
Autodistill: A new way to create CV models
Some of the foundation/base models include:
* GroundedSAM (Segment Anything Model)
* DETIC
* GroundingDINO
-
[P] Image search with localization and open-vocabulary reranking.
For localisation at search time I ended up using OWL-ViT. This worked really well. I did not try Detic or CLIPseg but would be interested to hear if anyone else has tried these?
-
training object detector using classified images?
git clone https://github.com/facebookresearch/Detic
cd Detic
pip install -r requirements.txt
python demo.py --config-file configs/Detic_LCOCOI21k_CLIP_SwinB_896b32_4x_ft4x_max-size.yaml --input desk.jpg --output out.jpg --vocabulary lvis --opts MODEL.WEIGHTS models/Detic_LCOCOI21k_CLIP_SwinB_896b32_4x_ft4x_max-size.pth
-
[P] Any object detection library
You might want to take a look at DETIC : https://github.com/facebookresearch/Detic (Open Vocabulary Object Detection, trained on thousands of classes)
-
[P] Awesome Image Segmentation Project Based on Deep Learning (5.6k star)
Are there any open-vocabulary segmentation models included in this repo, like Detic or LSeg?
-
[R] CLIP-Fields: Weakly Supervised Semantic Fields for Robotic Memory + Code + Robot demo
We made this using pretty recent advances in web-data pretrained models like Detic and LSeg for detection, CLIP for visual queries, and Sentence BERT for semantic queries. Our "database" is really a neural field (Instant NGP) that maps from 3D coordinates to a high dimensional embedding vector in the same representation space as CLIP and SBERT.
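The "neural field as a database" idea above can be sketched in a few lines: a coordinate encoding feeding a small MLP whose outputs are unit vectors in the same space as the query embeddings. This is a hypothetical illustration only — it substitutes a simple Fourier-feature encoding for the Instant-NGP hash grid, uses random weights instead of a trained model, and uses a random placeholder vector where CLIP-Fields would use a real CLIP or Sentence-BERT query embedding.

```python
import numpy as np

rng = np.random.default_rng(0)

def fourier_features(xyz, num_bands=6):
    """Sin/cos positional encoding of 3D points (stand-in for the Instant-NGP hash grid)."""
    freqs = 2.0 ** np.arange(num_bands)              # frequency bands: 1, 2, 4, ...
    scaled = xyz[:, :, None] * freqs                 # (N, 3, num_bands)
    feats = np.concatenate([np.sin(scaled), np.cos(scaled)], axis=-1)
    return feats.reshape(xyz.shape[0], -1)           # (N, 3 * 2 * num_bands)

class SemanticField:
    """Tiny MLP mapping 3D coordinates to unit embeddings in a shared query space."""
    def __init__(self, embed_dim=512, hidden=256, num_bands=6):
        self.num_bands = num_bands
        in_dim = 3 * 2 * num_bands
        self.w1 = rng.normal(0.0, 0.1, (in_dim, hidden))
        self.w2 = rng.normal(0.0, 0.1, (hidden, embed_dim))

    def __call__(self, xyz):
        h = np.maximum(fourier_features(xyz, self.num_bands) @ self.w1, 0.0)  # ReLU layer
        emb = h @ self.w2
        # Normalize so outputs live on the unit sphere, like CLIP/SBERT embeddings.
        return emb / np.linalg.norm(emb, axis=-1, keepdims=True)

# Querying: score sampled locations against a placeholder query embedding.
field = SemanticField()
points = rng.uniform(-1.0, 1.0, (100, 3))            # sample locations in the scene
emb = field(points)                                  # (100, 512) unit embeddings
query = rng.normal(size=512)
query /= np.linalg.norm(query)                       # stands in for a real text/image query embedding
scores = emb @ query                                 # cosine similarity per point
best_point = points[np.argmax(scores)]               # location that best matches the query
```

In the actual system the field is trained so that each point's embedding matches the Detic/LSeg and CLIP features observed at that point in the RGB-D scans; at query time, the most similar locations answer "where is X?" questions.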
-
[P] Using OpenAI's CLIP repository as a support, I was able to create a software to detect anything in an image at its original resolution!
Is it similar to the open-vocabulary Detic?
-
Researchers at Meta and the University of Texas at Austin Propose ‘Detic’: A Method to Detect Twenty-Thousand Classes using Image-Level Supervision
Code for https://arxiv.org/abs/2201.02605 found: https://github.com/facebookresearch/Detic
- Detecting Twenty-thousand Classes using Image-level Supervision
-
[R] Detecting Twenty-thousand Classes using Image-level Supervision
github: https://github.com/facebookresearch/Detic
clip-fields
-
[R] CLIP-Fields: Weakly Supervised Semantic Fields for Robotic Memory + Code + Robot demo
Best part, I believe, is that you should be able to train your own CLIP-Field for your living room if you have an hour, a decent GPU, and a way to get RGB-D video (an iPhone 13 Pro works great!) I hope you can give the code a try: https://github.com/notmahi/clip-fields or check out the website https://mahis.life/clip-fields/ for more interactive demos. Our Arxiv submission is also out now, at https://arxiv.org/abs/2210.05663, and if you want a longer tl;dr with a couple more videos, check out this tweet. Thanks!
- Teaching robots to respond to queries with CLIP and NeRF-like neural fields
What are some alternatives?
GroundingDINO - Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
PoseNDF - Implementation of Pose-NDF: Modeling Human Pose Manifolds with Neural Distance Fields
FasterRCNN - Clean and readable implementations of Faster R-CNN in PyTorch and TensorFlow 2 with Keras.
instant-ngp - Instant neural graphics primitives: lightning fast NeRF and more
ultralytics - NEW - YOLOv8 🚀 in PyTorch > ONNX > OpenVINO > CoreML > TFLite
marqo - Unified embedding generation and search engine. Also available on cloud - cloud.marqo.ai
segment-anything - The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
lang-seg - Language-Driven Semantic Segmentation
clipseg - This repository contains the code of the CVPR 2022 paper "Image Segmentation Using Text and Image Prompts".
super-gradients - Easily train or fine-tune SOTA computer vision models with one open source training library. The home of Yolo-NAS.
sentence-transformers - Multilingual Sentence & Image Embeddings with BERT
yolov7 - Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors