New free tool that uses a fine-tuned BERT model to surface answers from research papers

This page summarizes the projects mentioned and recommended in the original post on /r/LanguageTechnology

  • elasticsearch-learning-to-rank

    Plugin to integrate Learning to Rank (aka machine learning for better relevance) with Elasticsearch

  • I worked on a learning-to-rank problem at a previous job (which unfortunately never got deployed, womp womp). This was early days, so at the time I was looking at using LambdaMART with Solr or Elasticsearch for reranking, with a Bayesian click model to get pseudo-labels for relevance.

  • similarity

    TensorFlow Similarity is a Python package focused on making similarity learning quick and easy.

  • TensorFlow Ranking and TensorFlow Similarity (maybe relevant/irrelevant contrastive learning?) look like they could be useful.

  • ColBERT

    ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23)

  • ColBERT and successors for retrieval.

  • haystack

    LLM orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.

  • There are also some cool tools like Haystack that would be useful in putting some of these together.

  • Milvus

    A cloud-native vector database, storage for next generation AI applications

  • And of course you could use some sort of vector search engine like Milvus, with nearest neighbors on embeddings.

    Some good papers here.

  • qdrant

    Qdrant - High-performance, massive-scale Vector Database for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/

  • https://github.com/qdrant/qdrant Oh nice, very cool that it's open source and all in Rust. Milvus is, I think, mostly Go, but it has some parts in C++.

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
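
For readers unfamiliar with the "nearest neighbors on embeddings" idea mentioned in the Milvus comment, here is a minimal sketch of what such a search does conceptually. It is a brute-force cosine-similarity lookup in plain Python, not the Milvus or Qdrant API; those engines use approximate indexes to make this fast at scale. The document IDs, the three-dimensional vectors, and the function names are all made up for illustration; real embeddings would come from a model such as a fine-tuned BERT encoder.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def nearest_neighbors(query, index, k=2):
    """Return the k (doc_id, score) pairs most similar to the query.

    This scans every stored vector, which is fine for a toy index but
    is exactly the cost that dedicated vector databases avoid.
    """
    scored = [(doc_id, cosine_similarity(query, vec))
              for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy "embeddings"; real ones have hundreds of dimensions.
index = {
    "paper_a": [1.0, 0.0, 0.0],
    "paper_b": [0.8, 0.6, 0.0],
    "paper_c": [0.0, 0.0, 1.0],
}
query = [1.0, 0.1, 0.0]

top = nearest_neighbors(query, index, k=2)
for doc_id, score in top:
    print(f"{doc_id}: {score:.3f}")
```

In this toy index the query points mostly along the first axis, so `paper_a` ranks first and `paper_b` second, while `paper_c` (orthogonal to the query) is left out of the top two.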
