Top 6 locality-sensitive-hashing Open-Source Projects
Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk
Project mention: Bitmap Indexes in Go: Search Speed | news.ycombinator.com | 2022-09-22
Ducks, the story:
I was using a Python in-memory vector search engine called Annoy to do semantic search on various kinds of data. It worked great for finding "similar" objects. Story A has similar text to story B, image A looks like image B, etc.
But doing basic metadata lookups was surprisingly hard. How do I get all images matching some criteria (say, size range, or tags)? I'd have to serialize them all into a DB, and use a DB index. Databases are great, but they add code bloat and overhead; I'm usually working in Jupyter notebooks and I like keeping as few external dependencies as possible.
So I wrote ducks as a quick, convenient way to index anything.
There are lots of other usage patterns, of course; it's very generic. It makes a great Wordle / crossword solver too. "Find me words where the first letter is A and the fifth letter is L" is very fast in ducks.
Indexing is just one of those things you always need. Python didn't have a good way to do it, and now it does!
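To make the Wordle-style query from the story concrete, here is a minimal sketch of attribute indexing in plain Python. It does not use the ducks API; the helper names `build_position_index` and `find_words` and the word list are hypothetical, chosen only for illustration. The idea is to precompute a lookup from (position, letter) to the set of matching words, so a query becomes a set intersection instead of a scan.

```python
from collections import defaultdict


def build_position_index(words):
    """Map (position, letter) -> set of words with that letter at that position."""
    index = defaultdict(set)
    for word in words:
        for pos, letter in enumerate(word):
            index[(pos, letter)].add(word)
    return index


def find_words(index, constraints):
    """Return the words that satisfy every (position, letter) constraint."""
    matches = [index.get(key, set()) for key in constraints]
    if not matches:
        return set()
    # Intersect starting from the smallest set to keep intermediate results small.
    matches.sort(key=len)
    result = set(matches[0])
    for candidate_set in matches[1:]:
        result &= candidate_set
    return result


# Hypothetical sample data.
words = ["angel", "apple", "anvil", "bagel", "april", "alarm"]
index = build_position_index(words)

# "First letter is A and fifth letter is L" (positions are 0-based here).
result = sorted(find_words(index, [(0, "a"), (4, "l")]))
print(result)  # ['angel', 'anvil', 'april']
```

Building the index costs one pass over the data; after that, each query touches only the sets named in its constraints.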
MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble
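As a rough illustration of the first technique in that list, the sketch below estimates Jaccard similarity with MinHash using only the standard library. It is not the listed project's implementation or API; the function names, the seeded SHA-1 hashing scheme, and the sample sentences are assumptions made for this example.

```python
import hashlib


def minhash_signature(items, num_perm=128):
    """Build a MinHash signature: for each seed, keep the smallest hash over all items."""
    if not items:
        raise ValueError("items must be non-empty")
    signature = []
    for seed in range(num_perm):
        # A seeded SHA-1 stands in for one random permutation (an assumption of this sketch).
        signature.append(
            min(
                int.from_bytes(
                    hashlib.sha1(f"{seed}:{item}".encode("utf-8")).digest()[:8], "big"
                )
                for item in items
            )
        )
    return signature


def estimate_jaccard(sig_a, sig_b):
    """The fraction of positions where two signatures agree estimates Jaccard similarity."""
    agree = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return agree / len(sig_a)


# Hypothetical sample data: sets of words from two similar sentences.
doc_a = set("the quick brown fox jumps over the lazy dog".split())
doc_b = set("the quick brown fox leaps over a sleepy dog".split())

true_jaccard = len(doc_a & doc_b) / len(doc_a | doc_b)
estimate = estimate_jaccard(minhash_signature(doc_a), minhash_signature(doc_b))
print(f"true={true_jaccard:.3f} estimated={estimate:.3f}")
```

The estimate gets tighter as `num_perm` grows, at the cost of a longer signature per set. LSH then groups signatures into buckets so that only likely-similar pairs are compared.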
Open source audio fingerprinting in .NET. An efficient algorithm for acoustic fingerprinting written purely in C#.
Project mention: [P] Is it feasible to find a mapping between two non-synthesized audio signals of the same audio sequence? | reddit.com/r/MachineLearning | 2022-08-21
Elasticsearch plugin and Lucene library for nearest neighbor search. Store vectors and run similarity search using exact and approximate algorithms.
Near-duplicate image detection using Locality Sensitive Hashing
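One common way to apply LSH to near-duplicate images is to split a fixed-length bit fingerprint into bands and treat images that share any band as candidates. The sketch below shows that banding step only. It is not taken from the listed project; the 64-bit fingerprints are hypothetical values standing in for perceptual hashes, and the file names and function names are made up for illustration.

```python
from collections import defaultdict
from itertools import combinations


def candidate_pairs(fingerprints, num_bits=64, num_bands=8):
    """Return pairs of names whose fingerprints are identical in at least one band."""
    rows = num_bits // num_bands
    mask = (1 << rows) - 1
    buckets = defaultdict(set)
    for name, fp in fingerprints.items():
        for band in range(num_bands):
            chunk = (fp >> (band * rows)) & mask
            buckets[(band, chunk)].add(name)
    pairs = set()
    for names in buckets.values():
        for a, b in combinations(sorted(names), 2):
            pairs.add((a, b))
    return pairs


def hamming(a, b):
    """Number of differing bits between two integer fingerprints."""
    return bin(a ^ b).count("1")


# Hypothetical fingerprints: the second differs from the first in 2 bits,
# the third has every bit flipped.
base = 0x0123456789ABCDEF
fingerprints = {
    "cat.jpg": base,
    "cat_resized.jpg": base ^ 0b101,
    "dog.jpg": ~base & 0xFFFFFFFFFFFFFFFF,
}

pairs = candidate_pairs(fingerprints)
# Confirm candidates with an exact Hamming-distance check (threshold is an assumption).
duplicates = [p for p in sorted(pairs)
              if hamming(fingerprints[p[0]], fingerprints[p[1]]) <= 5]
print(duplicates)  # [('cat.jpg', 'cat_resized.jpg')]
```

With 8 bands, any two fingerprints that differ in at most 7 bits must agree on at least one whole band, so they are always surfaced as candidates; more distant pairs are found only by chance.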
Find duplicate text files.
locality-sensitive-hashing related posts
What do You Prefer?
1 project | reddit.com/r/comics | 14 Sep 2022
[D] Any example for Novalty detection in RGB image Dataset with Pytorch?
1 project | reddit.com/r/MLQuestions | 27 Aug 2022
Can anyone help me refine my model (music based CNN - artist recognition)?
1 project | reddit.com/r/learnmachinelearning | 9 Jun 2022
Spotify's annoy library vs Siamese Neural Networks for measuring similarity among songs and their artists?
1 project | reddit.com/r/learnmachinelearning | 5 Jun 2022
Is doing knn on the output of a music artist classifier a good way to build a system that suggests new music?
1 project | reddit.com/r/learnmachinelearning | 9 Mar 2022
Should we begin Linear Algebra with Matrices, or start with Vector Spaces?
1 project | reddit.com/r/math | 18 Jan 2022
NLP Method(s) for Finding Commonalities?
2 projects | reddit.com/r/LanguageTechnology | 13 Aug 2021
What are some of the best open-source locality-sensitive-hashing projects? This list will help you: