Otherwise, things were easy to start up and play around with, even for an AI noob like me. Both their web and text-upload source connectors worked without issue in my testing.
[1]: https://github.com/danswer-ai/danswer/pull/139
Check this out. Built on a vector database (https://github.com/milvus-io/milvus) and a semantic cache (https://github.com/zilliztech/GPTCache)
https://osschat.io/
Nowadays falcon-40b is probably more accurate than gpt4j; here's hoping we get llama.cpp support for falcon builds soon [0]!
[0]: https://github.com/ggerganov/llama.cpp/issues/1602
The GGLLM fork seems to be the leading falcon option for now [1].
It comes with its own variant of the GGML sub-format, "ggcv1", but there are quants available on HF [2].
Although if you have a GPU, I'd go with the newly released AWQ quantization instead [3]; the performance is better.
(I may or may not have a mild local LLM addiction - and video cards cost more than drugs)
[1] https://github.com/cmp-nct/ggllm.cpp
[2] https://huggingface.co/TheBloke/falcon-7b-instruct-GGML
[3] https://huggingface.co/abhinavkulkarni/tiiuae-falcon-7b-inst...
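For anyone new to this: the quant formats mentioned above (GGML's q4 variants, AWQ) all build on the same core idea of storing weights as small integers plus a per-group scale factor. Here's a toy sketch of symmetric 4-bit quantization to illustrate the principle; it is not the actual GGML or AWQ algorithm, which add tricks like grouping and activation-aware scale selection:

```python
# Toy illustration of 4-bit weight quantization: store weights as
# integers in [-8, 7] plus one float scale. A simplified sketch of the
# idea behind GGML q4 quants and AWQ, not the real algorithms.

def quantize_q4(weights):
    """Symmetric 4-bit quantization: map floats to ints in [-8, 7]."""
    scale = max(abs(w) for w in weights) / 7.0 or 1.0  # avoid scale == 0
    quants = [max(-8, min(7, round(w / scale))) for w in weights]
    return quants, scale

def dequantize_q4(quants, scale):
    """Recover approximate float weights from the 4-bit ints."""
    return [q * scale for q in quants]

weights = [0.12, -0.43, 0.91, -0.07, 0.33]
quants, scale = quantize_q4(weights)
restored = dequantize_q4(quants, scale)
# Each restored weight lands within half a quantization step of the original.
assert all(abs(w - r) <= scale / 2 + 1e-9 for w, r in zip(weights, restored))
```

A 16-element group stored this way takes 16 × 4 bits + one fp16 scale ≈ 5 bits/weight instead of 16, which is roughly why 7B models fit on consumer cards.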