Python neural-search

Open-source Python projects categorized as neural-search

Top 15 Python neural-search Projects

  • jina

    ☁️ Build multimodal AI applications with cloud-native stack

  • Project mention: Jina.ai: Self-host Multimodal models | news.ycombinator.com | 2024-01-26
  • clip-as-service

    🏄 Scalable embedding, reasoning, ranking for images and sentences with CLIP

  • Project mention: Search for anything ==> Immich fails to download textual.onnx | /r/immich | 2023-09-15
  • PaddleNLP

    👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting wide-range of NLP tasks from research to industrial applications, including 🗂Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis etc.

  • txtai

    💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows

  • Project mention: Build knowledge graphs with LLM-driven entity extraction | dev.to | 2024-02-21

    txtai is an all-in-one embeddings database for semantic search, LLM orchestration and language model workflows.

  • dalle-flow

    🌊 A Human-in-the-Loop workflow for creating HD images from text

  • docarray

    Represent, send, store and search multimodal data

  • Project mention: DocArray – Represent, send, and store multimodal data for ML | news.ycombinator.com | 2023-04-27
  • finetuner

    🎯 Task-oriented embedding tuning for BERT, CLIP, etc.

  • refinery

    The data scientist's open-source choice to scale, assess and maintain natural language data. Treat training data like a software artifact.

  • mteb

    MTEB: Massive Text Embedding Benchmark

  • Project mention: AI for AWS Documentation | news.ycombinator.com | 2023-07-06

    RAG is very difficult to do right. I am experimenting with various RAG projects from [1]. The main problems are:

    - Chunking can interfere with context boundaries.

    - Content vectors can differ vastly from question vectors. To address this, use hypothetical embeddings: generate artificial questions for each chunk and store their embeddings.

    - Instead of saving just one embedding per text chunk, store several (the text chunk, hypothetical question embeddings, metadata).

    - RAG will fail miserably on requests like "summarize the whole document".

    - To my knowledge, OpenAI embeddings don't perform well. Use an embedding that is optimized for question answering or information retrieval and supports multiple languages. Also look into instructor embeddings: https://github.com/embeddings-benchmark/mteb

    1 https://github.com/underlines/awesome-marketing-datascience/...
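
The suggestion in the comment above, storing several vectors per text chunk (the chunk itself plus hypothetical questions) and mapping every hit back to its chunk, can be sketched roughly as follows. This is a minimal illustration, not the API of any project on this list: `embed` is a toy bag-of-words stand-in for a real embedding model, the hypothetical questions are hand-written here rather than generated by an LLM, and all names (`build_index`, `search`, the sample chunks) are assumptions made for the example.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def build_index(chunks):
    """Store one vector for the chunk text and one per hypothetical question.

    Every vector carries the id of the chunk it belongs to.
    """
    index = []
    for chunk_id, chunk in enumerate(chunks):
        index.append((embed(chunk["text"]), chunk_id))
        for question in chunk["questions"]:
            index.append((embed(question), chunk_id))
    return index


def search(index, chunks, query, k=1):
    """Score each chunk by its best-matching vector and return the top k."""
    query_vec = embed(query)
    best = {}
    for vec, chunk_id in index:
        score = cosine(query_vec, vec)
        if score > best.get(chunk_id, -1.0):
            best[chunk_id] = score
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)[:k]
    return [(chunks[chunk_id], score) for chunk_id, score in ranked]


# Hypothetical sample data; the questions would normally be LLM-generated.
chunks = [
    {
        "text": "Rotate the access key every 90 days using the console.",
        "questions": ["How often should I change my credentials?"],
        "meta": {"source": "security.md"},
    },
    {
        "text": "Buckets are created in a single region and replicated on request.",
        "questions": ["Where is my data stored?"],
        "meta": {"source": "storage.md"},
    },
]

index = build_index(chunks)
query = "how often should I change credentials"
top_chunk, top_score = search(index, chunks, query)[0]
print(top_chunk["meta"]["source"], round(top_score, 3))
```

In this toy setup the query shares no words with the chunk text itself, so the match comes entirely from the stored question vector, which is the gap between content vectors and question vectors that the comment describes.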

  • primeqa

    The prime repository for state-of-the-art Multilingual Question Answering research and development.

  • Project mention: State-of-the-Art Multilingual Question Answering | /r/aiengineer | 2023-07-10
  • vectordb

    A Python vector database you just need - no more, no less. (by jina-ai)

  • Project mention: A Python Vector Database | news.ycombinator.com | 2023-08-13
  • cherche

    Neural Search

  • Project mention: [P] Semantic search | /r/MachineLearning | 2023-05-08

    If you are interested, you can check out the documentation here: https://github.com/raphaelsty/cherche

  • neural-cherche

    Neural Search

  • Project mention: [P] Introducing Neural-Cherche: Enhance Document Retrieval with Advanced AI Models | /r/MachineLearning | 2023-11-19

    I'm excited to share a tool I've developed called Neural-Cherche. Its main purpose is to transform a Sentence Transformer into a ColBERT model, which is currently at the forefront of information retrieval tools.

  • weaviate-txtai

    An integration of the weaviate vector search engine with txtai

  • Project mention: External database integration | dev.to | 2023-09-07

    As mentioned previously, all of the main components of txtai can be replaced with custom components. For example, there are external integrations for storing dense vectors in Weaviate and Qdrant to name a few.

  • AquilaHub

    Load and serve Neural Encoder Models

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2024-02-21.

Index

What are some of the best open-source neural-search projects in Python? This list will help you:

#   Project           Stars
1   jina              19,884
2   clip-as-service   12,169
3   PaddleNLP         11,386
4   txtai              6,910
5   dalle-flow         2,824
6   docarray           2,730
7   finetuner          1,423
8   refinery           1,358
9   mteb               1,314
10  primeqa              696
11  vectordb             460
12  cherche              311
13  neural-cherche       291
14  weaviate-txtai         7
15  AquilaHub              2