RHO-Loss
google-research
Metric | RHO-Loss | google-research
---|---|---
Mentions | 1 | 74
Stars | 143 | 27,994
Growth | 3.5% | 3.8%
Activity | 5.5 | 9.8
Last commit | 6 months ago | 5 days ago
Language | Python | Jupyter Notebook
License | Apache License 2.0 | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
RHO-Loss
- [D] Most important AI Paper´s this year so far in my opinion + Proto AGI speculation at the end
RHO-LOSS: Prioritized Training on Points that are Learnable, Worth Learning, and Not Yet Learnt. Trains models 18x faster with higher accuracy. Paper: https://arxiv.org/abs/2206.07137 GitHub: https://github.com/OATML/RHO-Loss
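The selection idea named in the paper title can be sketched roughly as follows. This is a minimal illustration, not the code from the OATML repository: it assumes each candidate point already has two per-example losses, one from the model being trained and one from a separate model trained only on holdout data (the "irreducible" loss). The function name and the sample numbers are hypothetical.

```python
def select_rho_loss(train_losses, irreducible_losses, k):
    """Return indices of the k points with the highest reducible holdout loss.

    Hypothetical helper for illustration; not part of the RHO-Loss codebase.
    """
    if len(train_losses) != len(irreducible_losses):
        raise ValueError("loss lists must have the same length")
    # Reducible holdout loss: the current training loss minus the loss of a
    # model trained only on holdout data (the "irreducible" loss).
    reducible = [t - i for t, i in zip(train_losses, irreducible_losses)]
    # Rank candidate indices from highest to lowest reducible loss.
    ranked = sorted(range(len(reducible)), key=lambda idx: reducible[idx], reverse=True)
    return ranked[:k]


# Made-up per-example losses for a candidate batch of five points.
train_losses = [2.3, 0.1, 1.9, 2.5, 0.8]
irreducible_losses = [2.2, 0.05, 0.4, 0.6, 0.7]
selected = select_rho_loss(train_losses, irreducible_losses, k=2)
print(selected)  # [3, 2]
```

In this toy batch, point 0 has a high training loss but an almost equally high irreducible loss (likely noisy or not learnable), and point 1 already has a low loss (already learnt), so neither is selected.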
google-research
- Train custom AI models on spreadsheet data with just a few clicks
For the purposes of demonstration, we trained OpenAI's Babbage model on Google's GoEmotions dataset, which labels 58k Reddit comments with emotions.
Hey u/habylab, you're right — I mentioned in my original comment that this was an example of training a model on the GoEmotions dataset from a spreadsheet.
- Run Clip on iPhone to Search Photos
Nice blog post.
I wonder if it's possible to speed up the search with something like https://github.com/google-research/google-research/tree/mast...
Also kind of surprising that something like this is not officially supported already! In my books that means this is a Good Idea
- Deep Learning Pioneer Geoffrey Hinton Publishes New Deep Learning Algorithm
- [Discussion] NLP for products matching
Plus, the graph posted there is rather self-explanatory. It also gives you the names of competing libraries and their benchmarks. As you can see, ScaNN is the best so far, but I use annoy since its speed is sufficient for me (I usually need to match around 10k strings to 80k strings) and its usage is very simple and straightforward.
- Nearest-neighbor search in high-dimensional spaces
Don't roll your own solution, use ScaNN (https://github.com/google-research/google-research/tree/master/scann) or Faiss (https://github.com/facebookresearch/faiss). I used the internal version of ScaNN while I was at Google, and found it incredibly well put-together. Can't speak to the open-source version, but it should be similarly good. These might be a bit overkill given your set sizes, but it'll be easier than building your own fix.
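For context on what these libraries speed up, here is a minimal sketch of exact (brute-force) nearest-neighbor search, which compares the query against every stored vector. It is plain Python for illustration only and does not use the ScaNN or Faiss APIs. The function names and sample vectors are hypothetical, and cosine similarity is an assumed choice of metric.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def exact_top_k(query, vectors, k):
    """Return indices of the k vectors most similar to the query, best first."""
    # Score every stored vector: this full scan is the cost that
    # approximate indexes are designed to avoid.
    scored = ((cosine_similarity(query, v), i) for i, v in enumerate(vectors))
    return [i for _, i in heapq.nlargest(k, scored)]


# Made-up 3-dimensional vectors standing in for real embeddings.
vectors = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
nearest = exact_top_k([1.0, 0.05, 0.0], vectors, k=2)
print(nearest)  # [0, 1]
```

Each query costs time proportional to the number of vectors times their dimension, which is fine for small sets but is the reason dedicated libraries are recommended at scale.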
- The Vector Database Index: Who, what, why now, & how
https://github.com/google-research/google-research/tree/mast...
We use ScaNN for a large scale/performant neural search. Otherwise this all feels bloated.
- Learn Python, from scratch (Apprendre Python, de zéro)
- [D] Most important AI Paper´s this year so far in my opinion + Proto AGI speculation at the end
An Evolutionary Approach to Dynamic Introduction of Tasks in Large-scale Multitask Learning Systems (Google, 2022; Pathways; Jeff Dean). The network grows with the number of tasks and the amount of data. Paper: https://arxiv.org/abs/2205.12755 GitHub: https://github.com/google-research/google-research/tree/master/muNet
What are some alternatives?
qdrant - Qdrant - Vector Search Engine and Database for the next generation of AI applications. Also available in the cloud https://cloud.qdrant.io/
Milvus - A cloud-native vector database with high-performance and high scalability.
fast-soft-sort - Fast Differentiable Sorting and Ranking
struct2depth - Models and examples built with TensorFlow
faiss - A library for efficient similarity search and clustering of dense vectors.
ml-agents - The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.
ML-KWS-for-MCU - Keyword spotting on Arm Cortex-M Microcontrollers
rmi - A learned index structure
torchsort - Fast, differentiable sorting and ranking in PyTorch
CLIP - CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image
t5x
DALLE-mtf - Open-AI's DALL-E for large scale training in mesh-tensorflow.