| | lofi-dx | searcharray |
|---|---|---|
| Mentions | 2 | 4 |
| Stars | 7 | 162 |
| Growth | - | - |
| Activity | 8.6 | 9.7 |
| Last commit | 2 days ago | 5 days ago |
| Language | TypeScript | Python |
| License | ISC License | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
lofi-dx
- A search engine in 80 lines of Python
Hey, I tackled phrase matching in my toy project here: https://github.com/vasilionjea/lofi-dx/blob/main/test/search...
I think I tested it thoroughly but any feedback would be appreciated!
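For readers unfamiliar with phrase matching, here is a minimal Python sketch of the general technique: a positional index plus a check for consecutive positions. It is not lofi-dx's implementation (that project is TypeScript); the function names `build_positional_index` and `phrase_match` and the whitespace tokenization are assumptions made for illustration.

```python
from collections import defaultdict


def build_positional_index(docs):
    """Map each term to {doc_id: [positions]} using naive whitespace tokenization."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in enumerate(docs):
        for pos, term in enumerate(text.lower().split()):
            index[term][doc_id].append(pos)
    return index


def phrase_match(index, phrase):
    """Return sorted doc ids where the phrase terms occur at consecutive positions."""
    terms = phrase.lower().split()
    if not terms or any(t not in index for t in terms):
        return []
    # Only documents containing every term can match.
    candidates = set(index[terms[0]])
    for t in terms[1:]:
        candidates &= set(index[t])
    matches = []
    for doc_id in candidates:
        later = [set(index[t][doc_id]) for t in terms[1:]]
        # A match starts at position p if the i-th later term sits at p + i + 1.
        if any(
            all(p + i + 1 in later[i] for i in range(len(later)))
            for p in index[terms[0]][doc_id]
        ):
            matches.append(doc_id)
    return sorted(matches)


docs = ["the quick brown fox", "brown quick fox", "a quick brown dog"]
index = build_positional_index(docs)
print(phrase_match(index, "quick brown"))
```

Storing positions rather than just document ids is what makes the order check possible; a plain term index could only say that both words appear somewhere in the document.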
searcharray
- A search engine in 80 lines of Python
This is really cool. I have a pretty fast BM25 search engine in Pandas I've been working on for local testing.
https://github.com/softwaredoug/searcharray
Why Pandas? Because BM25 is one thing, but you also want to combine with other factors (recency, popularity, etc) easily computed in pandas / numpy...
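To illustrate the point about combining BM25 with other factors, here is a rough sketch in plain pandas / numpy. It does not use SearchArray itself: the `bm25` column is a stand-in for scores from any text scorer, and the column names, the 30-day half-life, and the weights are all assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the bm25 column stands in for scores from any text scorer.
df = pd.DataFrame({
    "title": ["old popular post", "fresh niche post", "fresh popular post"],
    "bm25": [2.0, 2.0, 2.0],
    "age_days": [365.0, 1.0, 1.0],
    "views": [10_000, 10, 10_000],
})

# Recency: exponential decay with an assumed 30-day half-life.
recency = 0.5 ** (df["age_days"] / 30.0)
# Popularity: log-damped so large view counts do not swamp the text score.
popularity = np.log1p(df["views"])

# Assumed weights; in practice they would be tuned against labeled queries.
df["score"] = df["bm25"] + 1.0 * recency + 0.1 * popularity
ranked = df.sort_values("score", ascending=False)
print(ranked[["title", "score"]])
```

Because every factor is just a column, trying a different decay or weight is a one-line change followed by a re-sort.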
- Are we at peak vector database?
You might be interested in
https://github.com/softwaredoug/searcharray
- SearchArray turns Pandas string columns into a term index
- Show HN: SearchArray - Text Search in Pandas
I've long worked with Lucene based search engines like Solr and Elasticsearch. Anytime I need to experiment with relevance ranking in these systems, I'm exhausted by needing to set them up and work with something so disjoint from normal data tooling.
Further - the underlying ranking is buried in needless mystique (you know a boolean should query sums the scores, right?). You shouldn't need to read a book (like Relevant Search ;) ) to unpack mystique that's really basic math.
Why not just let people build ranking systems with vectorized math in a numpy/pandas stack?
SearchArray lets anyone build a search prototype in Pandas, typically by building and experimenting with a smaller labeled dataset. If it works out, you can transfer it relatively easily to Elasticsearch or Solr for implementation.
SearchArray is a pandas extension array that creates an underlying search index for BM25 term/phrase based searching.
It's not quite done (will it ever be?) but it's getting far enough along to be useful. So feedback is very welcome.
https://github.com/softwaredoug/searcharray
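The "basic math" claim in the post above can be sketched in plain numpy. This is not SearchArray's API or implementation, just a hand-rolled illustration under stated assumptions: whitespace tokenization, the classic BM25 formula with a Lucene-style IDF, default `k1=1.2` and `b=0.75`, and hypothetical helper names `bm25` and `should`.

```python
import numpy as np

docs = ["the cat sat", "the cat and the other cat", "dogs chase the ball"]
tokens = [d.split() for d in docs]
vocab = sorted({t for doc in tokens for t in doc})
col = {t: j for j, t in enumerate(vocab)}

# Term-frequency matrix: one row per document, one column per term.
tf = np.zeros((len(docs), len(vocab)))
for i, doc in enumerate(tokens):
    for t in doc:
        tf[i, col[t]] += 1

n_docs = len(docs)
doc_len = tf.sum(axis=1)
doc_freq = (tf > 0).sum(axis=0)
# Lucene-style IDF, which stays non-negative even for very common terms.
idf = np.log(1 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))


def bm25(term, k1=1.2, b=0.75):
    """BM25 score of one term against every document at once."""
    if term not in col:
        return np.zeros(n_docs)
    f = tf[:, col[term]]
    norm = k1 * (1 - b + b * doc_len / doc_len.mean())
    return idf[col[term]] * f * (k1 + 1) / (f + norm)


def should(*terms):
    """Boolean 'should' (OR) query: sum the per-term scores."""
    return sum((bm25(t) for t in terms), np.zeros(n_docs))


print(should("cat", "ball"))
```

Each query term yields one score per document, and the "should" query is an element-wise sum of those arrays, so the whole ranking stays inspectable as ordinary array math.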
What are some alternatives?
searchkit - Search UI for Elasticsearch & Opensearch. Compatible with Algolia's Instantsearch and Autocomplete components. React & Vue support
searx - Privacy-respecting metasearch engine [Moved to: https://github.com/searx/searx]
router - Fully typesafe Router for React (and friends) w/ built-in caching, 1st class search-param APIs, client-side cache integration and isomorphic rendering.
PaddleNLP - Easy-to-use and powerful NLP and LLM library with Awesome model zoo, supporting wide-range of NLP tasks from research to industrial applications, including Text Classification, Neural Search, Question Answering, Information Extraction, Document Intelligence, Sentiment Analysis etc.
orama - Fast, dependency-free, full-text and vector search engine with typo tolerance, filters, facets, stemming, and more. Works with any JavaScript runtime, browser, server, service!