Thanks for pointing out those models. A quick Hugging Face search shows that the bge model is available in GGML format. You can add new GGML-format models to the code simply by appending the direct download link to this line:
https://github.com/Dicklesworthstone/llama_embeddings_fastap...
So to add the base bge model, you would just add this URL to the list:
https://huggingface.co/maikaarda/bge-base-en-ggml/resolve/ma...
I will add that as an additional default.
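A minimal sketch of the pattern described above: the server keeps a list of direct-download URLs, and adding a model means appending one more link. The list name and the example URLs below are placeholders for illustration, not the actual contents of llama_embeddings_fastapi.

```python
# Placeholder list of model download links; adding a model = appending a URL.
MODEL_DOWNLOAD_URLS = [
    # existing default model (placeholder URL, not a real link)
    "https://example.com/models/llamacpp-compatible-model.ggmlv3.q4_0.bin",
    # newly added GGML model: just append its direct download link
    "https://example.com/models/bge-base-en.ggml.bin",
]

def model_filenames(urls: list[str]) -> list[str]:
    """Derive the on-disk filename for each model from its download URL."""
    return [url.rsplit("/", 1)[-1] for url in urls]
```

Presumably the final path segment of each URL becomes the cached filename the server fetches on startup, which is why a plain direct-download link is all that is needed.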
What's wrong with just using TorchServe[1]? We've been using it to serve embedding models in production.
[1] https://pytorch.org/serve/
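For context on the TorchServe approach: once a model is registered, TorchServe exposes it over an HTTP inference API at `/predictions/<model_name>` (port 8080 by default). The sketch below builds such a request with only the standard library; the model name `my_embedder`, the host, and the JSON payload shape are assumptions for illustration, since the payload a handler accepts depends on how the model was packaged.

```python
import json
import urllib.request

# TorchServe's inference API serves registered models at
# /predictions/<model_name>; host and model name here are assumed.
TORCHSERVE_URL = "http://localhost:8080/predictions/my_embedder"

def build_embedding_request(text: str, url: str = TORCHSERVE_URL) -> urllib.request.Request:
    """Build (but do not send) an HTTP request asking the server to embed `text`."""
    data = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )

# To actually call it (requires a running TorchServe instance):
# with urllib.request.urlopen(build_embedding_request("hello world")) as resp:
#     embedding = json.loads(resp.read())
```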
It's much better to use something like bge-large-en - https://github.com/arguflow/openembeddings