dino vs unsupervised-depth-completion-visual-inertial-odometry

| | dino | unsupervised-depth-completion-visual-inertial-odometry |
|---|---|---|
| Mentions | 7 | 2 |
| Stars | 6,697 | 190 |
| Growth | 2.4% | - |
| Activity | 0.0 | 5.0 |
| Latest commit | 9 months ago | over 1 year ago |
| Language | Python | Python |
| License | Apache License 2.0 | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
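The exact formula behind the activity number is not published here; as a rough, purely illustrative sketch, a recency-weighted commit count (with an assumed exponential decay and half-life) could look like this, with the relative 0-10 value then coming from ranking the score against all tracked projects:

```python
# Toy sketch of a recency-weighted activity score. The site's exact formula is
# not given; the half-life and exponential weighting below are assumptions.
import time


def activity_score(commit_timestamps, half_life_days=30.0):
    """Weight each commit by how recent it is, so new commits count more."""
    now = time.time()
    score = 0.0
    for ts in commit_timestamps:
        age_days = (now - ts) / 86400.0
        score += 0.5 ** (age_days / half_life_days)  # exponential decay with age
    return score


# A relative number like 9.0 would then come from percentile-ranking this score
# against every tracked project (e.g. 9.0 ~ top 10%).
```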
dino
- Batch-wise processing or image-by-image processing? (DINO V1)
- [P] Image search with localization and open-vocabulary reranking.
I also implemented one based on the self-attention maps from the DINO-trained ViTs. This worked pretty well when the attention maps were combined with some traditional computer vision to get bounding boxes. It seemed like an OK compromise between domain specialization and location specificity. I did not try any saliency or gradient-based methods, as I was not sure about generalization and speed, respectively. I know LAVIS has an implementation of Grad-CAM and it seems to work well in Plug-and-Play VQA.
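For reference, here is a minimal sketch of that kind of attention-based localization, assuming the `dino_vits8` checkpoint exposed through the facebookresearch/dino torch.hub entry point and its `get_last_selfattention` helper; the input size, threshold, and file name are illustrative, not details from the post:

```python
# Sketch: bounding-box localization from DINO self-attention plus classical CV.
import cv2
import numpy as np
import torch
from PIL import Image
from torchvision import transforms

# Assumption: the public DINO ViT-S/8 checkpoint via torch.hub.
model = torch.hub.load('facebookresearch/dino:main', 'dino_vits8')
model.eval()

patch_size = 8
preprocess = transforms.Compose([
    transforms.Resize((480, 480)),
    transforms.ToTensor(),
    transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)),
])
img = preprocess(Image.open('query.jpg').convert('RGB')).unsqueeze(0)

with torch.no_grad():
    # Shape (1, heads, tokens, tokens); row 0 is the [CLS] token's attention.
    attn = model.get_last_selfattention(img)

h_feat = img.shape[-2] // patch_size
w_feat = img.shape[-1] // patch_size
# Average the heads' CLS->patch attention and reshape to the patch grid.
cls_attn = attn[0, :, 0, 1:].mean(0).reshape(h_feat, w_feat).numpy()

# Traditional CV on top: threshold, take the largest blob, box it.
mask = (cls_attn > cls_attn.mean() + cls_attn.std()).astype(np.uint8)
mask = cv2.resize(mask, (img.shape[-1], img.shape[-2]),
                  interpolation=cv2.INTER_NEAREST)
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
if contours:
    x, y, w, h = cv2.boundingRect(max(contours, key=cv2.contourArea))
    print('bounding box:', (x, y, w, h))
```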
- Unsupervised semantic segmentation
You will probably need an unwieldy amount of data and compute to reproduce it, so your best option would be to use the pretrained models available on GitHub.
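As a loose illustration of what those pretrained models give you, the sketch below clusters DINO patch features into a coarse per-patch segment map; the checkpoint name, input size, and cluster count are assumptions, and this is far simpler than a full unsupervised segmentation pipeline:

```python
# Sketch: crude unsupervised segmentation by k-means clustering of DINO patch
# features (assumes the dino_vits8 torch.hub checkpoint; cluster count is an
# arbitrary illustrative choice).
import torch
from PIL import Image
from sklearn.cluster import KMeans
from torchvision import transforms

model = torch.hub.load('facebookresearch/dino:main', 'dino_vits8')
model.eval()

preprocess = transforms.Compose([
    transforms.Resize((240, 240)),
    transforms.ToTensor(),
    transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)),
])
img = preprocess(Image.open('scene.jpg').convert('RGB')).unsqueeze(0)

with torch.no_grad():
    # get_intermediate_layers returns per-token features; drop the [CLS] token.
    feats = model.get_intermediate_layers(img, n=1)[0][0, 1:]  # (tokens, dim)

h = w = 240 // 8  # patch grid for ViT-S/8 at a 240x240 input
labels = KMeans(n_clusters=5, n_init=10).fit_predict(feats.numpy())
segmentation = labels.reshape(h, w)  # coarse per-patch segment map
print(segmentation.shape)
```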
- [D] Why Transformers are taking over the Computer Vision world: Self-Supervised Vision Transformers with DINO explained in 7 minutes!
[Full Explanation Post] [Arxiv] [Project Page]
- A major part of real-world AI has to be solved to make unsupervised, generalized full self-driving work, as the entire road system is designed for biological neural nets with optical imagers
Except he is actually talking about the new DINO model created by Facebook that was released on Friday, which is a new approach to image transformers for unsupervised segmentation. Here's its GitHub.
- [D] Paper Explained - DINO: Emerging Properties in Self-Supervised Vision Transformers (Full Video Analysis)
Code: https://github.com/facebookresearch/dino
- [R] DINO and PAWS: Advancing the state of the art in computer vision with self-supervised Transformers
unsupervised-depth-completion-visual-inertial-odometry
- Unsupervised Depth Completion from Visual Inertial Odometry
Hey there, interested in camera and range sensor fusion for point cloud (depth) completion?
Here is an extended version of our [talk](https://www.youtube.com/watch?v=oBCKO4TH5y0) at ICRA 2020, where we do a step-by-step walkthrough of our paper Unsupervised Depth Completion from Visual Inertial Odometry (joint work with Xiaohan Fei, Stephanie Tsuei, and Stefano Soatto).
In this talk, we present an unsupervised method (no need for human supervision/annotations) for learning to recover dense point clouds from images, captured by cameras, and sparse point clouds, produced by lidar or tracked by visual inertial odometry (VIO) systems. To illustrate what I mean, here is an [example](https://github.com/alexklwong/unsupervised-depth-completion-visual-inertial-odometry/blob/master/figures/void_teaser.gif?raw=true) of the point clouds produced by our method.
Our method is lightweight (so you can run it on your computer!) and is built on top of [XIVO](https://github.com/ucla-vision/xivo), our VIO system.
For those interested here are links to the [paper](https://arxiv.org/pdf/1905.08616.pdf), [code](https://github.com/alexklwong/unsupervised-depth-completion-visual-inertial-odometry) and the [dataset](https://github.com/alexklwong/void-dataset) we collected.
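To make the training signal more concrete, here is a loose sketch of the kinds of losses unsupervised depth completion methods in this family combine (photometric reconstruction from a warped adjacent frame, sparse-depth consistency against the VIO/lidar points, edge-aware smoothness); the function names and weights are illustrative and not taken from the authors' code:

```python
# Sketch of typical unsupervised depth-completion loss terms; weights and names
# are assumptions, not the released implementation.
import torch


def photometric_loss(image, reprojected):
    # Penalize appearance differences between the image and its reconstruction
    # from a temporally adjacent frame warped through the predicted depth.
    return (image - reprojected).abs().mean()


def sparse_depth_loss(pred_depth, sparse_depth):
    # Only supervise pixels where VIO or lidar actually provides a depth value.
    mask = (sparse_depth > 0).float()
    return (mask * (pred_depth - sparse_depth).abs()).sum() / mask.sum().clamp(min=1)


def smoothness_loss(pred_depth, image):
    # Encourage locally smooth depth, relaxed across image intensity edges.
    dx_d = (pred_depth[:, :, :, 1:] - pred_depth[:, :, :, :-1]).abs()
    dy_d = (pred_depth[:, :, 1:, :] - pred_depth[:, :, :-1, :]).abs()
    dx_i = (image[:, :, :, 1:] - image[:, :, :, :-1]).abs().mean(1, keepdim=True)
    dy_i = (image[:, :, 1:, :] - image[:, :, :-1, :]).abs().mean(1, keepdim=True)
    return (dx_d * torch.exp(-dx_i)).mean() + (dy_d * torch.exp(-dy_i)).mean()


def total_loss(image, reprojected, pred_depth, sparse_depth,
               w_ph=1.0, w_sz=1.0, w_sm=0.1):
    # Weighted sum of the three unsupervised terms (weights are illustrative).
    return (w_ph * photometric_loss(image, reprojected)
            + w_sz * sparse_depth_loss(pred_depth, sparse_depth)
            + w_sm * smoothness_loss(pred_depth, image))
```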
- [N][R] ICRA 2020 extended talk for Unsupervised Depth Completion from Visual Inertial Odometry
In this talk, we present an unsupervised method (no need for human supervision/annotations) for learning to recover dense point clouds from images, captured by cameras, and sparse point clouds, produced by lidar or tracked by visual inertial odometry (VIO) systems. To illustrate what I mean, you can visit our GitHub page for examples (GIFs) of point clouds produced by our method.
What are some alternatives?
pytorch-metric-learning - The easiest way to use deep metric learning in your application. Modular, flexible, and extensible. Written in PyTorch.
calibrated-backprojection-network - PyTorch Implementation of Unsupervised Depth Completion with Calibrated Backprojection Layers (ORAL, ICCV 2021)
simsiam-cifar10 - Code to train the SimSiam model on cifar10 using PyTorch
simclr - SimCLRv2 - Big Self-Supervised Models are Strong Semi-Supervised Learners
lightly - A python library for self-supervised learning on images.
bpycv - Computer vision utils for Blender (generate instance annotation, depth and 6D pose with one line of code)