lantern
searcharray
 | lantern | searcharray |
---|---|---|
Mentions | 5 | 4 |
Stars | 661 | 162 |
Growth | 8.3% | - |
Activity | 9.6 | 9.7 |
Last commit | 6 days ago | 3 days ago |
Language | C | Python |
License | GNU General Public License v3.0 or later | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
lantern
-
Are we at peak vector database?
Traditional DBs already kinda support vector search via extensions such as pgvector.
There is a YC startup, Lantern, that also built its own open-source extension for Postgres, which is better for vector DB use cases: https://github.com/lanterndata/lantern
But yeah! Traditional DBs already support this, if you consider this extension to be part of Postgres.
-
90x Faster Than Pgvector - Lantern's HNSW Index Creation Time
This extension is licensed under the Business Source License[0], which makes it incompatible with most DBaaS offerings. The BSL is a source-available license, not an open-source one. Good choice for Lantern, but unusable for everyone else.
Some Postgres offerings allow you to bring your own extensions, for instance Neon[1], where I work. I tried to look at AWS docs for you, but couldn't find anything about that. I did find Trusted Language Extensions[2], but that seems to be more about writing your own extension. Couldn't find a way to upload arbitrary extensions.
[0]: https://github.com/lanterndata/lantern/commit/dda7f064ca80af...
-
Show HN: Lantern - a PostgreSQL vector database for building AI applications
Install and use our extension here: https://github.com/lanterndata/lantern
Features today + Coming soon
searcharray
-
A search engine in 80 lines of Python
This is really cool. I have a pretty fast BM25 search engine in Pandas I've been working on for local testing.
https://github.com/softwaredoug/searcharray
Why Pandas? Because BM25 is one thing, but you also want to combine with other factors (recency, popularity, etc) easily computed in pandas / numpy...
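As a rough illustration of that point, text relevance can be blended with other signals using ordinary arithmetic. The sketch below uses plain Python to stay dependency-free; the BM25 values, field names (`age_days`, `views`), and the 30-day half-life are all invented for illustration and do not come from SearchArray. In pandas or numpy the same blend would be one expression over columns.

```python
import math

# Hypothetical documents: the bm25 values are made up, not computed by SearchArray.
docs = [
    {"id": "a", "bm25": 2.0, "age_days": 400, "views": 10},
    {"id": "b", "bm25": 1.5, "age_days": 2, "views": 5000},
    {"id": "c", "bm25": 0.0, "age_days": 1, "views": 9000},
]

def blended_score(doc, half_life_days=30.0):
    """Multiply text relevance by recency and popularity boosts."""
    # Exponential decay: the recency boost halves every half_life_days.
    recency = 0.5 ** (doc["age_days"] / half_life_days)
    # Log-dampened popularity so very large view counts do not dominate.
    popularity = math.log1p(doc["views"])
    return doc["bm25"] * (1.0 + recency) * (1.0 + popularity)

ranked = sorted(docs, key=blended_score, reverse=True)
ranked_ids = [d["id"] for d in ranked]
```

The boosts are multiplicative here so that a document with no text match stays at zero however fresh or popular it is; an additive blend is an equally reasonable choice when non-matching documents should still be able to surface.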
-
Are we at peak vector database?
You might be interested in
https://github.com/softwaredoug/searcharray
- SearchArray turns Pandas string columns into a term index
-
Show HN: SearchArray - Text Search in Pandas
I've long worked with Lucene based search engines like Solr and Elasticsearch. Anytime I need to experiment with relevance ranking in these systems, I'm exhausted by needing to set them up and work with something so disjoint from normal data tooling.
Further - the underlying ranking is buried in needless mystique (you know a boolean should query sums the scores, right?). You shouldn't need to read a book (like Relevant Search ;) ) to unpack mystique that's really basic math.
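To make the "sums the scores" remark concrete, here is a minimal sketch of how optional (should) clauses combine. The function name `should_query` and the per-clause score dictionaries are hypothetical, and the sketch ignores engine details such as minimum-should-match settings.

```python
def should_query(clause_scores):
    """Combine optional (should) clauses by summing per-document scores.

    clause_scores: list of dicts, each mapping doc id -> score for one clause.
    A document appears in the result if at least one clause matched it.
    """
    totals = {}
    for scores in clause_scores:
        for doc_id, score in scores.items():
            totals[doc_id] = totals.get(doc_id, 0.0) + score
    return totals

# Made-up scores for a title clause and a body clause.
title_scores = {"doc1": 1.2, "doc2": 0.4}
body_scores = {"doc2": 0.9, "doc3": 0.3}
combined = should_query([title_scores, body_scores])
```

A document matched by both clauses (`doc2` here) ends up with the sum of its two clause scores, which is why it can outrank a document that matched only one clause strongly.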
Why not just let people build ranking systems with vectorized math in a numpy/pandas stack?
SearchArray lets anyone build a search prototype in Pandas. Typically building / experimenting with a smaller labeled dataset. If it works out, you can transfer it relatively easily to Elasticsearch or Solr for implementation.
SearchArray is a pandas extension array that creates an underlying search index for BM25 term/phrase based searching.
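For readers unfamiliar with what a term index and BM25 scoring involve, here is a minimal pure-Python sketch. It is not SearchArray's implementation or API: the helper names `build_index` and `bm25`, the whitespace tokenizer, and the defaults `k1=1.2`, `b=0.75` are assumptions, and the formula is one common Okapi BM25 variant.

```python
import math
from collections import Counter, defaultdict

def build_index(texts):
    """Whitespace-tokenize each text and build term -> {doc_id: term_freq}."""
    postings = defaultdict(dict)
    doc_lens = []
    for doc_id, text in enumerate(texts):
        tokens = text.lower().split()
        doc_lens.append(len(tokens))
        for term, tf in Counter(tokens).items():
            postings[term][doc_id] = tf
    return postings, doc_lens

def bm25(term, postings, doc_lens, k1=1.2, b=0.75):
    """Return one BM25 score per document for a single query term."""
    n_docs = len(doc_lens)
    avg_len = sum(doc_lens) / n_docs
    matches = postings.get(term, {})
    df = len(matches)
    # Rare terms get a higher inverse document frequency.
    idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
    scores = [0.0] * n_docs
    for doc_id, tf in matches.items():
        # Length normalization: longer-than-average documents are penalized.
        norm = k1 * (1 - b + b * doc_lens[doc_id] / avg_len)
        scores[doc_id] = idf * tf * (k1 + 1) / (tf + norm)
    return scores

texts = ["the cat sat", "the cat chased the other cat", "dogs bark"]
postings, doc_lens = build_index(texts)
cat_scores = bm25("cat", postings, doc_lens)
```

Because `bm25` returns one score per document, the result lines up with the rows of the original collection, which is the property that makes this kind of scoring convenient to store as a DataFrame column.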
It's not quite done (will it ever be?) but it's getting far enough along to be useful. So feedback is very welcome.
https://github.com/softwaredoug/searcharray
What are some alternatives?
vector-search-class-notes - Class notes for the course "Long Term Memory in AI - Vector Search and Databases" COS 597A @ Princeton Fall 2023
searx - Privacy-respecting metasearch engine [Moved to: https://github.com/searx/searx]
frameless - Expressive types for Spark.
PaddleNLP - Easy-to-use and powerful NLP and LLM library with Awesome model zoo, supporting wide-range of NLP tasks from research to industrial applications, including Text Classification, Neural Search, Question Answering, Information Extraction, Document Intelligence, Sentiment Analysis etc.
usearch - Fast Open-Source Search & Clustering engine for Vectors & Strings in C++, C, Python, JavaScript, Rust, Java, Objective-C, Swift, C#, GoLang, and Wolfram
lantern_extras - Routines for generating, manipulating, parsing, importing vector embeddings into Postgres tables
react-semantic-search