searcharray

Full text search in your Pandas dataframe (by softwaredoug)

Searcharray Alternatives

Similar projects and alternatives to searcharray

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a better searcharray alternative or higher similarity.

searcharray reviews and mentions

Posts with mentions or reviews of searcharray. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-02-07.
  • A search engine in 80 lines of Python
    6 projects | news.ycombinator.com | 7 Feb 2024
    This is really cool. I have a pretty fast BM25 search engine in Pandas I've been working on for local testing.

    https://github.com/softwaredoug/searcharray

    Why Pandas? Because BM25 is one thing, but you also want to combine it with other factors (recency, popularity, etc.) that are easily computed in pandas / numpy...

  • Are we at peak vector database?
    8 projects | news.ycombinator.com | 25 Jan 2024
    You might be interested in

    https://github.com/softwaredoug/searcharray

  • SearchArray turns Pandas string columns into a term index
    1 project | news.ycombinator.com | 27 Dec 2023
  • Show HN: SearchArray – Text Search in Pandas
    1 project | news.ycombinator.com | 19 Nov 2023
    I've long worked with Lucene-based search engines like Solr and Elasticsearch. Anytime I need to experiment with relevance ranking in these systems, I'm exhausted by needing to set them up and work with something so disjoint from normal data tooling.

    Further - the underlying ranking is buried in needless mystique (you know a boolean should query just sums the scores, right?). You shouldn't need to read a book (like Relevant Search ;) ) to unpack mystique that's really basic math.

    Why not just let people build ranking systems with vectorized math in a numpy/pandas stack?

    SearchArray lets anyone build a search prototype in Pandas, typically by building and experimenting with a smaller labeled dataset. If it works out, you can transfer it relatively easily to Elasticsearch or Solr for implementation.

    SearchArray is a pandas extension array that creates an underlying search index for BM25 term/phrase-based searching.

    It's not quite done (will it ever be?) but it's getting far enough along to be useful. So feedback is very welcome.

    https://github.com/softwaredoug/searcharray
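
The posts above describe ranking as "really basic math": a boolean should query sums per-term scores, and the result can be blended with other signals such as recency using ordinary pandas / numpy operations. The following is a minimal sketch of that idea, not SearchArray's API or implementation. It scores terms with a common BM25 variant written directly in numpy; the sample data, the `days_old` column, the helper names, and the blending formula are all hypothetical and chosen only for illustration.

```python
import math

import numpy as np
import pandas as pd

# Hypothetical toy corpus; `days_old` stands in for any non-text ranking signal.
docs = pd.DataFrame({
    "title": [
        "pandas text search",
        "vector database hype",
        "text search in python",
        "time series data",
    ],
    "days_old": [10, 400, 30, 5],
})

# Naive whitespace tokenization (an assumption; real analyzers do much more).
tokenized = docs["title"].str.lower().str.split().tolist()


def bm25_term_scores(tokenized, term, k1=1.2, b=0.75):
    """Score a single term against every document with a common BM25 variant."""
    doc_lens = np.array([len(toks) for toks in tokenized], dtype=float)
    tf = np.array([toks.count(term) for toks in tokenized], dtype=float)
    n_docs = len(tokenized)
    doc_freq = np.count_nonzero(tf)  # number of documents containing the term
    idf = math.log(1 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    # Length normalization: longer-than-average documents are penalized.
    norm = k1 * (1 - b + b * doc_lens / doc_lens.mean())
    return idf * tf * (k1 + 1) / (tf + norm)


def should_query(tokenized, terms):
    """A boolean 'should' query: the sum of the per-term scores."""
    return np.sum([bm25_term_scores(tokenized, t) for t in terms], axis=0)


docs["bm25"] = should_query(tokenized, ["text", "search"])

# Blend the text score with a recency signal using plain vectorized math.
# The exponential decay and the (1 + recency) boost are arbitrary choices.
docs["recency"] = np.exp(-docs["days_old"] / 365.0)
docs["score"] = docs["bm25"] * (1 + docs["recency"])

ranked = docs.sort_values("score", ascending=False)
print(ranked[["title", "bm25", "recency", "score"]])
```

Because every step is a column or array operation, swapping in a different signal (popularity, for example) or a different blend is a one-line change, which is the kind of experimentation the posts argue for.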


Stats

Basic searcharray repo stats
4
159
9.7
4 days ago

softwaredoug/searcharray is an open source project licensed under Apache License 2.0 which is an OSI approved license.

The primary programming language of searcharray is Python.

