word2vec
groupImg
 | word2vec | groupImg
---|---|---
Mentions | 3 | 2
Stars | 1,490 | 222
Growth | - | -
Activity | 0.0 | 4.9
Last commit | about 1 year ago | 27 days ago
Language | C | Python
License | Apache License 2.0 | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
word2vec
- Is Cosine-Similarity of Embeddings Really About Similarity?
The original paper included source code, which contains the test data and results: it gets ~77% accuracy on about 20k example word analogies (with 99.7% coverage), and 78% accuracy on phrases with 77% coverage. You can see the test set here:
https://github.com/tmikolov/word2vec/blob/master/questions-w...
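To make the accuracy and coverage figures above concrete, here is a minimal sketch of how an analogy test of the form "a is to b as c is to d" can be scored. It is an illustration under assumptions, not a transcription of the word2vec evaluation tool: the function name `evaluate_analogies`, the toy 3-dimensional embeddings, and the toy questions are all hypothetical. Accuracy is computed over the questions whose four words are all in the vocabulary, and coverage is the share of questions that meet that condition.

```python
import math


def normalize(v):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def evaluate_analogies(embeddings, questions):
    """Return (accuracy, coverage) for analogy questions given as (a, b, c, d)."""
    vecs = {word: normalize(v) for word, v in embeddings.items()}
    answered = 0
    correct = 0
    for a, b, c, d in questions:
        if not all(w in vecs for w in (a, b, c, d)):
            continue  # out-of-vocabulary question: lowers coverage only
        answered += 1
        # Predicted direction for the missing word: b - a + c.
        target = [vb - va + vc for va, vb, vc in zip(vecs[a], vecs[b], vecs[c])]
        best_word, best_score = None, -math.inf
        for word, v in vecs.items():
            if word in (a, b, c):
                continue  # the question words themselves are not valid answers
            score = sum(t * x for t, x in zip(target, v))
            if score > best_score:
                best_word, best_score = word, score
        if best_word == d:
            correct += 1
    accuracy = correct / answered if answered else 0.0
    coverage = answered / len(questions) if questions else 0.0
    return accuracy, coverage


# Hypothetical toy data, chosen only so the example is easy to follow.
toy_embeddings = {
    "man": [1.0, 0.0, 0.0],
    "woman": [1.0, 1.0, 0.0],
    "king": [1.0, 0.0, 1.0],
    "queen": [1.0, 1.0, 1.0],
    "apple": [0.0, 0.0, -1.0],
}
toy_questions = [
    ("man", "woman", "king", "queen"),
    ("man", "woman", "prince", "princess"),  # not in the toy vocabulary
]

accuracy, coverage = evaluate_analogies(toy_embeddings, toy_questions)
print(accuracy, coverage)
```

With the toy data, one of the two questions is skipped because its words are missing from the vocabulary, which is the situation the coverage percentage describes.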
- Introduction to K-Means Clustering
It is not necessarily the case.
For example, word2vec runs k-means clustering with a cosine similarity measure [1]. It works very, very well. The caveat is that few of the optimized variants of k-means work with that "distance".
[1] https://github.com/tmikolov/word2vec/blob/master/word2vec.c#...
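As a rough illustration of k-means with a cosine similarity measure, the sketch below normalizes every point to unit length, assigns each point to the centroid with the largest dot product, and re-normalizes each centroid after averaging its members. This variant is often called spherical k-means. It is a sketch under assumptions and not the code from word2vec.c: the function name `cosine_kmeans`, its parameters, and the sample points are hypothetical.

```python
import math
import random


def unit(v):
    """Return v scaled to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cosine_kmeans(points, k, iterations=10, seed=0):
    """Cluster points by direction; returns (labels, centroids)."""
    rng = random.Random(seed)
    data = [unit(p) for p in points]
    # Start from k distinct data points picked at random.
    centroids = [list(p) for p in rng.sample(data, k)]
    labels = [0] * len(data)
    for _ in range(iterations):
        # Assignment step: highest cosine similarity wins.
        labels = [max(range(k), key=lambda c: dot(p, centroids[c])) for p in data]
        # Update step: average the members, then project back to unit length.
        for c in range(k):
            members = [p for p, label in zip(data, labels) if label == c]
            if members:
                mean = [sum(col) / len(members) for col in zip(*members)]
                centroids[c] = unit(mean)
    return labels, centroids


# Hypothetical sample: three points near the x-axis, three near the y-axis.
# Their lengths differ a lot, but only direction matters for cosine similarity.
sample_points = [
    (1.0, 0.1), (2.0, 0.1), (5.0, 0.2),
    (0.1, 1.0), (0.2, 3.0), (0.1, 4.0),
]
labels, centroids = cosine_kmeans(sample_points, k=2)
print(labels)
```

Because the centroids are re-normalized after every update, speed-ups that rely on Euclidean geometry do not carry over unchanged, which is one way to read the caveat in the comment above.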
groupImg
- Introduction to K-Means Clustering
If anyone is interested, I have two projects that use k-means:
https://github.com/victorqribeiro/groupImg
https://github.com/victorqribeiro/budget
Since it was one of the first ML algorithms that I learned, I spent some time finding use cases for it.
If I'm not mistaken, I've also used it to classify deforestation in an exercise.
- Show HN: I made NIGHT.FM, a cyberpunk-inspired online radio
Well, I took a class back in college with an old-school professor. He's the one who drove the class that way. I just enjoyed the process. But there are a lot of tutorials on the internet about writing your own NN. I think the first ML algorithm I ever wrote was k-means [1]. Start there and see where it takes you:
https://en.wikipedia.org/wiki/K-means_clustering
Here is a project where I used it:
https://github.com/victorqribeiro/groupImg
What are some alternatives?
hdbscan - A high performance implementation of HDBSCAN clustering.
m2cgen - Transform ML models into a native code (Java, C, Python, Go, JavaScript, Visual Basic, C#, R, PowerShell, PHP, Dart, Haskell, Ruby, F#, Rust) with zero dependencies
ckwrap - Wrapper for Ckmeans.1d.dp.
labelme2coco - A lightweight package for converting your labelme annotations into COCO object detection format.
RobotEyes - Image comparison for Robot Framework
tslearn - The machine learning toolkit for time series analysis in Python
albumentations - Fast image augmentation library and an easy-to-use wrapper around other libraries. Documentation: https://albumentations.ai/docs/ Paper about the library: https://www.mdpi.com/2078-2489/11/2/125
img2cmap - Create colormaps from images
differences-between-two-images - Detect and visualize differences between two images with OpenCV Python