sbert

Open-source projects categorized as sbert

Top 6 sbert Open-Source Projects

  • mteb

    MTEB: Massive Text Embedding Benchmark

  • Project mention: AI for AWS Documentation | news.ycombinator.com | 2023-07-06

    RAG is very difficult to do right. I am experimenting with various RAG projects from [1]. The main problems are:

    - Chunking can interfere with context boundaries

    - Content vectors can differ vastly from question vectors. To address this, use hypothetical embeddings: generate artificial questions for each chunk and store their embeddings

    - Instead of saving just one embedding per text chunk, store several (text chunk, hypothetical question embeddings, metadata)

    - RAG will fail miserably with requests like "summarize the whole document"

    - To my knowledge, OpenAI embeddings don't perform well. Use an embedding that is optimized for question answering or information retrieval and supports multiple languages. Also look into instructor embeddings: https://github.com/embeddings-benchmark/mteb

    [1] https://github.com/underlines/awesome-marketing-datascience/...
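
    The idea in the comment above of storing several embeddings per chunk (the chunk text plus generated questions) can be sketched roughly as follows. This is a minimal illustration, not the commenter's implementation: `embed` is a toy bag-of-words stand-in for a real sentence-embedding model, and the chunk schema (`text`, `questions`, `meta`), the sample chunks, and the function names are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words 'embedding'; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def build_index(chunks):
    """Store one vector for the chunk text and one per hypothetical question.

    Every entry points back to its chunk id, so a match on a question
    still retrieves the original chunk (and its metadata).
    """
    index = []
    for chunk_id, chunk in enumerate(chunks):
        index.append((embed(chunk["text"]), chunk_id, "text"))
        for question in chunk["questions"]:
            index.append((embed(question), chunk_id, "question"))
    return index


def search(index, query, top_k=1):
    """Rank chunks by their best-matching vector (text or question)."""
    query_vec = embed(query)
    best = {}
    for vec, chunk_id, _kind in index:
        score = cosine(query_vec, vec)
        if score > best.get(chunk_id, -1.0):
            best[chunk_id] = score
    ranked = sorted(best.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_k]


# Hypothetical sample data; in practice the questions would be generated
# by a language model for each chunk.
chunks = [
    {
        "text": "S3 buckets store objects; lifecycle rules expire old versions.",
        "questions": ["How do I delete old files automatically in S3?"],
        "meta": {"source": "s3-guide"},
    },
    {
        "text": "Lambda functions run code without provisioning servers.",
        "questions": ["How can I run code without managing servers?"],
        "meta": {"source": "lambda-guide"},
    },
]

index = build_index(chunks)
query = "how do I automatically delete old files"
results = search(index, query)
print(results, chunks[results[0][0]]["meta"])
```

    In this toy setup the query shares only one word with the text of the first chunk but almost all of its words with the stored question, which is the mismatch between content vectors and question vectors that the comment describes.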

  • beir

    A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.

  • Project mention: On building a semantic search engine | news.ycombinator.com | 2024-01-06

    The BEIR project might be what you're looking for: https://github.com/beir-cellar/beir/wiki/Leaderboard

  • targetedSummarization

    TextReducer - A Tool for Summarization and Information Extraction

  • AnnA_Anki_neuronal_Appendix

    Using machine learning on your anki collection to enhance the scheduling via semantic clustering and semantic similarity

  • SBERT-for-Question-Answering-on-COVID-19-Dataset

    Sentence Bert for Question-Answering on COVID-19 Open Research Dataset (CORD-19)

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

sbert related posts

  • Bing/ChatGPT browser can understand and summarize a 15-page PDF in seconds.

    2 projects | /r/singularity | 10 Feb 2023
  • Targeted Summarization - A tool for information extraction

    1 project | /r/LanguageTechnology | 25 Jan 2023
  • cool links

    1 project | /r/u_Walkier | 24 Aug 2021

Index

What are some of the best open-source sbert projects? This list will help you:

#  Project                                           Stars
1  mteb                                              1,448
2  beir                                              1,407
3  bert-solr-search                                    161
4  targetedSummarization                                87
5  AnnA_Anki_neuronal_Appendix                          57
6  SBERT-for-Question-Answering-on-COVID-19-Dataset      3
