simsimd vs np-sims

simsimd

By unum-cloud

DISCONTINUED

np-sims

numpy ufuncs for vector similarity (by softwaredoug)

Project	simsimd	np-sims
Mentions	1	2
Stars	-	14
Growth	-	-
Activity	-	8.6
Latest Commit	-	6 months ago
Language	-	Python
License	-	MIT License

The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

simsimd

Posts with mentions or reviews of simsimd. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-08-23.

Show HN: Fast Vector Similarity Using Rust and Python
8 projects | news.ycombinator.com | 23 Aug 2023

It’s a good start, but you can’t generally get even remotely close to hardware potential in Rust, let alone Python.
I had to implement a separate C99 library to always trigger the newest SIMD intrinsics, occasionally leveraging SVE on more recent ARM CPUs, that compilers don’t know how to generate.
That library is in turn used in USearch, which is designed for Approximate Search, but some users recently reported that they use it for brute force as well… where it performed 20x faster than FAISS.
https://github.com/unum-cloud/simsimd
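
For readers unfamiliar with the term, "brute force" here means scoring a query against every stored vector instead of consulting an approximate index. The following is a minimal NumPy sketch of that idea, not the simsimd or USearch API; the function name `brute_force_cosine_top_k` and the random data are assumptions made for illustration only.

```python
import numpy as np

def brute_force_cosine_top_k(queries, corpus, k=3):
    """Return indices and scores of the k most cosine-similar corpus rows per query."""
    # Normalize rows so that a plain dot product equals cosine similarity.
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    c = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    sims = q @ c.T  # shape: (n_queries, n_corpus)
    # Sort each row by descending similarity and keep the first k columns.
    top = np.argsort(-sims, axis=1)[:, :k]
    return top, np.take_along_axis(sims, top, axis=1)

# Hypothetical data: 1000 random vectors, with two queries placed near rows 0 and 1.
rng = np.random.default_rng(0)
corpus = rng.standard_normal((1000, 64)).astype(np.float32)
queries = corpus[:2] + 0.01 * rng.standard_normal((2, 64)).astype(np.float32)
idx, scores = brute_force_cosine_top_k(queries, corpus)
```

Every query touches every row, so the cost grows linearly with corpus size; that inner loop is the part a hand-tuned SIMD kernel would speed up.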

np-sims

Posts with mentions or reviews of np-sims. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-10-30.

Approximate Nearest Neighbors Oh Yeah
5 projects | news.ycombinator.com | 30 Oct 2023

I implemented this recently in C as a numpy extension[1], for fun. Even had a vectorized solution going.
You'll get diminishing returns on recall pretty fast. There's actually a theorem that tracks this - the Johnson-Lindenstrauss lemma[2], if you're interested.
As I mention in a talk I gave[3], it can work if you're going to rerank anyway and the vector search isn't the main ranking signal. It's also easy to update, as the hashes are non-parametric (they don't depend on the data).
The lack of data dependency, however, is the main problem. Vector spaces are lumpy. You can see this in the distribution of human beings on the surface of the earth - postal codes and area codes vary from small to huge - random hashes, like a grid, wouldn't let you accurately map out the distribution of all the people or clump them close to their actual nearest neighbors. Manhattan is not rural Alaska.
Annoy actually builds on these hashes by creating many trees of such hashes, finding a split into left and right at each node. Then it creates a forest of such trees. So it's essentially a forest of random hash trees with data dependency.
Hope that helps.
1 - https://github.com/softwaredoug/np-sims
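
The approach described in this comment (data-independent random hashes, followed by a rerank) might look roughly like the sketch below. It uses plain NumPy and is not the np-sims API; the helper names `random_hyperplanes`, `hash_vectors`, and `search`, along with all sizes and seeds, are assumptions chosen for illustration.

```python
import numpy as np

def random_hyperplanes(dim, n_bits, seed=0):
    # Data-independent: the planes are drawn once and never fit to the corpus.
    return np.random.default_rng(seed).standard_normal((n_bits, dim))

def hash_vectors(vectors, planes):
    # One bit per hyperplane: which side of the plane each vector falls on.
    return (vectors @ planes.T) > 0

def search(query, corpus, hashes, planes, n_candidates=50, k=5):
    q_hash = hash_vectors(query[None, :], planes)[0]
    # Hamming distance between the query signature and every stored signature.
    hamming = np.count_nonzero(hashes != q_hash, axis=1)
    candidates = np.argsort(hamming, kind="stable")[:n_candidates]
    # Rerank the shortlisted candidates with exact cosine similarity.
    cand = corpus[candidates]
    cos = (cand @ query) / (np.linalg.norm(cand, axis=1) * np.linalg.norm(query))
    order = np.argsort(-cos)[:k]
    return candidates[order], cos[order]

# Hypothetical data: 2000 random vectors and a query placed near row 7.
rng = np.random.default_rng(1)
corpus = rng.standard_normal((2000, 32))
planes = random_hyperplanes(dim=32, n_bits=64)
hashes = hash_vectors(corpus, planes)
query = corpus[7] + 0.01 * rng.standard_normal(32)
ids, sims = search(query, corpus, hashes, planes)
```

Adding a vector only requires hashing it against the same fixed planes, which is why updates are cheap. The same fixed planes are also the weakness the comment points out: they cut dense and sparse regions of the space equally coarsely.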

Show HN: Fast Vector Similarity Using Rust and Python
8 projects | news.ycombinator.com | 23 Aug 2023

Nice!
I recently implemented a C-based numpy solution of LSH to compress / recover cosine similarity[1]. It was my first time writing Numpy C, and it was a lot of fun to massively improve the performance over pure Python[2].
1- https://github.com/softwaredoug/np-sims
2- https://softwaredoug.com/blog/2023/08/22/rand-projections-in...
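
As a rough illustration of what "compress / recover cosine similarity" can mean, the sketch below stores only the sign bits of random projections and estimates cosine similarity from the fraction of differing bits, using the standard relation that this fraction approximates angle / pi. It is a NumPy sketch under those assumptions, not the code from np-sims or the linked blog post; `make_planes`, `compress`, and `estimated_cosine` are hypothetical names.

```python
import numpy as np

def make_planes(dim, n_bits, seed=0):
    # Random hyperplanes shared by every vector that will be compared.
    return np.random.default_rng(seed).standard_normal((n_bits, dim))

def compress(vectors, planes):
    # Keep only the sign of each projection, packed 8 bits per byte.
    bits = (vectors @ planes.T) > 0
    return np.packbits(bits, axis=1)

def estimated_cosine(sig_a, sig_b, n_bits):
    # Count differing bits, then map the differing fraction to an angle.
    differing = np.unpackbits(np.bitwise_xor(sig_a, sig_b)).sum()
    return float(np.cos(np.pi * differing / n_bits))

# Hypothetical data: two correlated 128-dimensional vectors.
rng = np.random.default_rng(42)
a = rng.standard_normal(128)
b = a + 0.8 * rng.standard_normal(128)
n_bits = 4096
planes = make_planes(128, n_bits)
sig_a, sig_b = compress(np.stack([a, b]), planes)
true_cos = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
approx_cos = estimated_cosine(sig_a, sig_b, n_bits)
```

Fewer bits give smaller signatures but noisier estimates, so `n_bits` is the knob that trades storage for accuracy.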

What are some alternatives?

When comparing simsimd and np-sims you can also consider the following projects:

swiss_army_llama - A FastAPI service for semantic text search using precomputed embeddings and advanced similarity measures, with built-in support for various file types through textract.

DoctorGPT - 💻📚💡 DoctorGPT provides advanced LLM prompting for PDFs and webpages.

fast_vector_similarity - The Fast Vector Similarity Library is designed to provide efficient computation of various similarity measures between vectors.

llama_embeddings_fastap

qdrant - Qdrant - High-performance, massive-scale Vector Database for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/
