You might be thinking of fine-tuning; if so, you should check out the alpaca-lora documentation.
Embeddings are just floating-point arrays that represent the underlying text, where 'tokens' that are often used together end up as numbers that are close to each other. You can generate a vector representation of any text very easily using the OpenAI APIs or frameworks like LangChain: https://github.com/openai/openai-cookbook/blob/main/examples...
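To make "close to each other" concrete, here is a minimal sketch that compares vectors with cosine similarity, one common closeness measure. The 4-dimensional vectors and the `toy_embeddings` name are made up for illustration; real embeddings come from a model and have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical hand-written vectors, NOT output from any real embedding model.
toy_embeddings = {
    "cat": [0.9, 0.8, 0.1, 0.0],
    "kitten": [0.85, 0.9, 0.15, 0.05],
    "invoice": [0.0, 0.1, 0.9, 0.8],
}

sim_cat_kitten = cosine_similarity(toy_embeddings["cat"], toy_embeddings["kitten"])
sim_cat_invoice = cosine_similarity(toy_embeddings["cat"], toy_embeddings["invoice"])

# Related words score higher than unrelated ones in this toy setup.
print(sim_cat_kitten > sim_cat_invoice)
```

With a real model you would replace the hand-written vectors with the arrays the embedding API returns; the comparison step stays the same.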
Similarity search means your text gets turned into an array of numbers and the vector database finds the closest matches. Vector databases are often used in conjunction with LLMs, for example to pull out all the relevant snippets of text and then feed them into the prompt along with your question.
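The retrieve-then-prompt flow described above can be sketched roughly as follows. This is a brute-force stand-in for what a vector database does, not any particular product's API: `top_k`, `build_prompt`, the `store` list, and all the vectors are hypothetical and hand-written for illustration.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, store, k=2):
    """Return the k snippet texts whose vectors are closest to query_vec."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]

def build_prompt(question, snippets):
    """Place the retrieved snippets in the prompt ahead of the question."""
    context = "\n".join(f"- {s}" for s in snippets)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

# Hypothetical store of (snippet, vector) pairs; the vectors are made up.
store = [
    ("pgvector adds a vector column type to Postgres.", [0.9, 0.1, 0.0]),
    ("Alpaca-LoRA fine-tunes LLaMA with low-rank adapters.", [0.1, 0.9, 0.1]),
    ("Vector indexes speed up nearest-neighbour search.", [0.8, 0.2, 0.1]),
]

# In practice this vector would be the embedding of the question itself.
query_vec = [0.85, 0.15, 0.05]

snippets = top_k(query_vec, store, k=2)
prompt = build_prompt("How do I store embeddings in Postgres?", snippets)
print(prompt)
```

A real vector database replaces the full sort with an index so it does not have to compare the query against every stored vector, but the inputs and outputs look much the same.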
I think Supabase generally does good work, but I don't think they can be given credit for pgvector, if that's what you're indicating (I might have misread).
As I understand it, Andrew Kane is the principal author of pgvector, and had worked on it for almost two years before Supabase added support for it.
See also https://github.com/pgvector/pgvector/issues/54 and https://github.com/supabase/postgres/pull/472.