DeepViewAgg vs CapDec
| | DeepViewAgg | CapDec |
|---|---|---|
| Mentions | 4 | 3 |
| Stars | 215 | 170 |
| Growth | - | - |
| Activity | 4.8 | 5.6 |
| Last commit | 9 months ago | 4 months ago |
| Language | Python | Python |
| License | GNU General Public License v3.0 or later | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
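The site's actual formula is not given here, so the following is only a minimal sketch of how such a relative score could work. It assumes an exponential recency weighting and a percentile-rank mapping onto a 0 to 10 scale; the function names, the `half_life_days` parameter, and the sample data are all hypothetical.

```python
from bisect import bisect_left


def recency_weighted_commits(commit_ages_days, half_life_days=30.0):
    """Hypothetical raw score: each commit's weight halves every half_life_days."""
    return sum(0.5 ** (age / half_life_days) for age in commit_ages_days)


def relative_activity(raw_score, all_raw_scores):
    """Hypothetical mapping of a raw score to a 0-10 scale by percentile rank."""
    ranked = sorted(all_raw_scores)
    # Count how many tracked projects have a strictly lower raw score.
    below = bisect_left(ranked, raw_score)
    return round(10.0 * below / len(ranked), 1)


# Assumed sample data: 100 tracked projects with raw scores 0..99.
tracked_scores = list(range(100))
score = relative_activity(90, tracked_scores)
```

Under these assumptions, a project whose raw score beats 90 of the 100 tracked projects lands at 9.0, which matches the "top 10%" reading of the example above.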
DeepViewAgg
- [R] [CVPR 2022 Oral] Learning Multi-View Aggregation In the Wild for Large-Scale 3D Semantic Segmentation
  Code for https://arxiv.org/abs/2204.07548 found: https://github.com/drprojects/DeepViewAgg
- [R] [CVPR2022 Oral] Learning Multi-View Aggregation In the Wild for Large-Scale 3D Semantic Segmentation
- [CVPR 2022 Oral] Learning Multi-View Aggregation In the Wild for Large-Scale 3D Semantic Segmentation
CapDec
- Open source – Unsupervised captioning getting closer to supervised captioning
- Reverse engineer Stable Diffusion images
  Cool! I also have a project that does image captioning: https://github.com/DavidHuji/CapDec
- CapDec: SOTA Zero Shot Image Captioning Using Clip and GPT2
What are some alternatives?
torch-points3d - PyTorch framework for doing deep learning on point clouds.
mmf - A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
MonoScene - [CVPR 2022] "MonoScene: Monocular 3D Semantic Scene Completion": 3D Semantic Occupancy Prediction from a single image
pytorch-widedeep - A flexible package for multimodal deep learning to combine tabular data with text and images using Wide and Deep models in PyTorch
Pointnet_Pointnet2_pytorch - PointNet and PointNet++ implemented in PyTorch (pure Python) for ModelNet, ShapeNet and S3DIS.
3DCoMPaT-v2 - 3DCoMPaT++: An improved large-scale 3D vision dataset for compositional recognition
img2dataset - Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.
MAGIC - Language Models Can See: Plugging Visual Controls in Text Generation
LAVIS - A One-stop Library for Language-Vision Intelligence
x-clip - A concise but complete implementation of CLIP with various experimental improvements from recent papers