Top 3 C# Embedding Projects
-
Catalyst
🚀 Catalyst is a C# Natural Language Processing library built for speed. Inspired by spaCy's design, it brings pre-trained models, out-of-the-box support for training word and document embeddings, and flexible entity recognition models. (by curiosity-ai)
-
umap-sharp
C# library for fast embeddings projection using Uniform Manifold Approximation and Projection
-
SemanticSlicer
Project mention: Pg_vectorize: The simplest way to do vector search and RAG on Postgres | news.ycombinator.com | 2024-03-06
I wrote a C# library to do this, which is similar to other common chunking approaches, like the way langchain does it: https://github.com/drittich/SemanticSlicer
Given a list of separators (regexes), it goes through them in order and keeps splitting the text by them until the chunk fits within the desired size. By putting the higher-level separators first (e.g., for HTML, split by higher-level tags before lower-level ones), it's a pretty good proxy for maintaining context.
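The splitting strategy described in the comment above can be sketched roughly as follows. This is a minimal illustration in Python, not SemanticSlicer's actual C# API: the function name `split_recursive`, the separator list, the use of character count as the size measure, and the hard-cut fallback when no separators remain are all assumptions made for this sketch.

```python
import re


def split_recursive(text, separators, max_len):
    """Split text into chunks of at most max_len characters.

    Separators are regexes tried in order, coarsest first. A piece that is
    still too long is split again with the remaining, finer separators.
    Separators should not contain capturing groups, because re.split would
    then return the captured text as extra pieces.
    """
    text = text.strip()
    if len(text) <= max_len:
        return [text] if text else []
    if not separators:
        # Assumption of this sketch: with no separators left, cut at fixed width.
        return [text[i:i + max_len] for i in range(0, len(text), max_len)]
    first, rest = separators[0], separators[1:]
    chunks = []
    for piece in re.split(first, text):
        chunks.extend(split_recursive(piece, rest, max_len))
    return chunks


# Hypothetical separator order: paragraphs, lines, sentences, then words.
separators = [r"\n\n+", r"\n", r"(?<=[.!?])\s+", r"\s+"]
text = "First paragraph. It has two sentences.\n\nSecond paragraph is short."
chunks = split_recursive(text, separators, 30)
print(chunks)
# ['First paragraph.', 'It has two sentences.', 'Second paragraph is short.']
```

Because the coarse separators are tried first, the short second paragraph survives as a single chunk, while only the overlong first paragraph is broken down to sentence level. A real chunker would likely measure size in tokens rather than characters and might merge small neighboring pieces back together.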
Index
What are some of the best open-source Embedding projects in C#? This list will help you:
# | Project | Stars |
---|---|---|
1 | Catalyst | 679 |
2 | umap-sharp | 32 |
3 | SemanticSlicer | 7 |