Java Panama Vector API Integrated with Apache Lucene

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • lucene

    Apache Lucene open-source search software

  • https://github.com/apache/lucene/issues/10047

    2. The Panama Vector API lets the JVM use SIMD instructions, on CPUs that support them, to accelerate vector operations: https://openjdk.org/jeps/438

    So this allows fast ANN on Lucene for semantic search!

    How did people do this before Lucene supported it? Only through entirely different tools?

  • ann-benchmarks

    Benchmarks of approximate nearest neighbor libraries in Python

  • This would be a big deal if Lucene became competitive on http://ann-benchmarks.com and turned into a serious (and more holistic) alternative to the vector databases.

    But, if I understand correctly, it comes with continued challenges:

    - The Panama Vector API is still an incubating API, and Java has taken its time providing an official way to use SIMD

    - It only works on Java 20, with a very specific set of flags passed to the JVM. It’ll take time for this change to make it into Elasticsearch and Solr

    - Panama itself is a weird and very low-level API.

    - Lucene organizes the HNSW vector index graph alongside its inverted index segments, and these need to be merged/compacted periodically. Merging HNSW graphs, as I understand it, is computationally difficult.
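
To make the SIMD point in the comments above concrete, here is a minimal sketch of a float dot product, a typical similarity kernel in vector search, written against the incubating jdk.incubator.vector API next to a plain scalar loop. The class and method names are hypothetical, and this is not Lucene's actual implementation. It assumes JDK 20 with the incubator module enabled via --add-modules jdk.incubator.vector at both compile time and run time.

```java
import jdk.incubator.vector.FloatVector;
import jdk.incubator.vector.VectorOperators;
import jdk.incubator.vector.VectorSpecies;

// Hypothetical illustration class; not taken from Lucene.
final class DotProductSketch {
    // Widest vector shape the current CPU/JVM supports for float lanes.
    private static final VectorSpecies<Float> SPECIES = FloatVector.SPECIES_PREFERRED;

    // Plain scalar baseline: one multiply-add per loop iteration.
    static float dotScalar(float[] a, float[] b) {
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    // Vector API version: processes SPECIES.length() floats per iteration.
    // Assumes a and b have the same length.
    static float dotVector(float[] a, float[] b) {
        int i = 0;
        int bound = SPECIES.loopBound(a.length); // largest multiple of the lane count <= length
        FloatVector acc = FloatVector.zero(SPECIES);
        for (; i < bound; i += SPECIES.length()) {
            FloatVector va = FloatVector.fromArray(SPECIES, a, i);
            FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
            acc = va.fma(vb, acc); // lane-wise va * vb + acc
        }
        float sum = acc.reduceLanes(VectorOperators.ADD); // horizontal sum of all lanes
        for (; i < a.length; i++) { // scalar tail for the leftover elements
            sum += a[i] * b[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        float[] a = {1f, 2f, 3f, 4f, 5f, 6f, 7f, 8f, 9f};
        float[] b = {2f, 2f, 2f, 2f, 2f, 2f, 2f, 2f, 2f};
        System.out.println("scalar: " + dotScalar(a, b));
        System.out.println("vector: " + dotVector(a, b));
    }
}
```

Because floating-point addition is not associative, the two methods can differ in the last few bits on general inputs, so a comparison between them should use a small tolerance rather than exact equality.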

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.
