If you're interested in graphs + RAG and want an alternate approach, txtai has a semantic graph component.
https://neuml.hashnode.dev/introducing-the-semantic-graph
https://github.com/neuml/txtai
Disclaimer: I'm the primary author of txtai
The article is a good summary of RAG in the enterprise. It shed some light for me on the quality of knowledge graphs (KGs) built with LLMs, an approach that Neo4j has recently been proposing [0].
According to the article, the approach is either costly (with OpenAI) or slow (with open-source models). In both cases, the quality of an LLM-generated KG is hard to predict.
[0] https://github.com/neo4j/NaLLM
OpenNRE (https://github.com/thunlp/OpenNRE) is another good approach to neural relation extraction, though it's slightly dated. What would be particularly interesting is to combine models like OpenNRE or SpanMarker with entity-linking models to construct KG triples. And a solid, scalable graph database underneath would make for a great knowledge base that can be constructed from unstructured text.
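To make the "relation extraction + entity linking" idea concrete, here is a minimal sketch of the glue step only. It does not call OpenNRE or SpanMarker; the `Relation` record, the `entity_links` lookup table, the `ent:` identifiers, and the `min_score` threshold are all hypothetical stand-ins for whatever those models would actually return.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """Hypothetical output record of a relation extraction model."""
    head: str       # surface mention of the head entity
    relation: str   # predicted relation label
    tail: str       # surface mention of the tail entity
    score: float    # model confidence in [0, 1]


def build_triples(relations, entity_links, min_score=0.5):
    """Map mentions to canonical IDs and return deduplicated KG triples.

    entity_links is an assumed lookup (lowercased mention -> canonical ID)
    standing in for a real entity-linking model.
    """
    triples = set()
    for r in relations:
        if r.score < min_score:
            continue  # drop low-confidence predictions
        head_id = entity_links.get(r.head.lower())
        tail_id = entity_links.get(r.tail.lower())
        if head_id is None or tail_id is None:
            continue  # skip mentions the linker could not resolve
        triples.add((head_id, r.relation, tail_id))
    return sorted(triples)


# Made-up example data, for illustration only.
relations = [
    Relation("Ada Lovelace", "collaborated_with", "Charles Babbage", 0.91),
    Relation("Lovelace", "collaborated_with", "Babbage", 0.84),
    Relation("Ada Lovelace", "born_in", "Londn", 0.40),
]
entity_links = {
    "ada lovelace": "ent:ada_lovelace",
    "lovelace": "ent:ada_lovelace",
    "charles babbage": "ent:charles_babbage",
    "babbage": "ent:charles_babbage",
}

triples = build_triples(relations, entity_links)
print(triples)
```

The point of linking before storing is that different mentions of the same entity collapse into one node, so the two "collaborated_with" predictions above become a single triple that a graph database could ingest.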
By this I presume you mean building a search index that can retrieve results based on keywords? I know certain databases use Lucene to build a keyword-based index on top of unstructured blobs of data. Another alternative is Tantivy (https://github.com/quickwit-oss/tantivy), a Lucene-inspired full-text search library written in Rust, if building search indices via Java isn't your cup of tea :)
I believe both libraries offer multilingual keyword support, which is an advantage over vector search, where multilingual embedding models are rather expensive.
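For anyone unfamiliar with what a keyword-based index does under the hood, here is a toy sketch of an inverted index in plain Python. It is not the Lucene or Tantivy API; the `InvertedIndex` class and `tokenize` helper are made up for illustration, and real engines add ranking, stemming, and on-disk segment storage on top of this idea.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase and split on non-word characters (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())


class InvertedIndex:
    """Toy index: maps each term to the set of document IDs containing it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        for term in tokenize(text):
            self.postings[term].add(doc_id)

    def search(self, query):
        """Return IDs of documents containing every query term (AND search)."""
        terms = tokenize(query)
        if not terms:
            return set()
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return result


index = InvertedIndex()
index.add(1, "Tantivy is a full-text search library written in Rust")
index.add(2, "Lucene is a full-text search library written in Java")
index.add(3, "Vector search relies on embedding models")

hits = index.search("search library rust")
print(hits)
```

Lookup cost depends on the number of query terms and the size of their posting sets, not on running a model over the query, which is a large part of why keyword indices are cheap compared with embedding-based search.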