Show HN: Llama2 Embeddings FastAPI Server

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • swiss_army_llama

    A FastAPI service for semantic text search using precomputed embeddings and advanced similarity measures, with built-in support for various file types through textract.

  • Thanks for pointing out those models. A quick Hugging Face search shows that the bge model is available in GGML format. You can add new GGML-format models to the code by adding the direct download link to this line:

    https://github.com/Dicklesworthstone/llama_embeddings_fastap...

    So to add the base bge model, you could just add this URL to the list:

    https://huggingface.co/maikaarda/bge-base-en-ggml/resolve/ma...

    I will add that as an additional default.

  • serve

    Serve, optimize and scale PyTorch models in production (by pytorch)

  • What's wrong with just using TorchServe[1]? We've been using it to serve embedding models in production.

    [1] https://pytorch.org/serve/

  • openembeddings

    Discontinued. Self-hostable, pay-for-what-you-use embedding server for bge-large-en and arbitrary embedding models, using crypto

  • It's much better to use something like bge-large-en - https://github.com/arguflow/openembeddings
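
The earlier comment about adding a GGML model describes appending a direct download link to the project's list of model URLs. The following is a minimal sketch of that idea, not the project's actual code: the names `MODEL_URLS`, `add_model_url`, and `model_file_name` are hypothetical, and the `example.com` URLs are placeholders, since the real links in the comment are truncated.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical default list; the real project keeps its own list at the
# line linked in the comment above.
MODEL_URLS = [
    "https://example.com/models/first-model.ggml.bin",  # placeholder URL
]


def add_model_url(urls, new_url):
    """Return a new list with new_url appended, skipping exact duplicates."""
    if new_url in urls:
        return list(urls)
    return [*urls, new_url]


def model_file_name(url):
    """Derive a local file name from the last path segment of a download URL."""
    return PurePosixPath(urlparse(url).path).name


if __name__ == "__main__":
    # Placeholder standing in for the bge download link from the comment.
    urls = add_model_url(MODEL_URLS, "https://example.com/models/bge-example.ggml.bin")
    for url in urls:
        print(model_file_name(url), "<-", url)
```

Deriving the file name from the URL path means a new model needs only one added line, which matches the "just add this URL to the list" workflow the comment describes.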

