Love how simple an interface this has. Local LLM tooling can be super daunting, but reducing it to a simple ingest() and then prompt() is really neat.
By chance, have you checked out Ollama (https://github.com/jmorganca/ollama) as a way to run the models like Llama 2? One of the goals of the project is to make it easy to download and run GPU-accelerated models, ideally with everything pre-compiled so it's easy to get up and running.
There's a LangChain model integration for it and a PrivateGPT example as well: https://github.com/jmorganca/ollama/tree/main/examples/priva.... Thought I'd share.
Best of luck with the project!
I've been playing around with https://github.com/imartinez/privateGPT and https://github.com/simonw/llm and wanted to create a simple Python package that made it easier to run ChatGPT-like LLMs on your own machine, use them with non-public data, and integrate them into practical applications.
This resulted in a Python package I call OnPrem.LLM.
In the documentation, there are examples for how to use it for information extraction, text generation, retrieval-augmented generation (i.e., chatting with documents on your computer), and text-to-code generation: https://amaiya.github.io/onprem/
Enjoy!
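The ingest()-then-prompt() flow people are praising can be illustrated with a minimal, self-contained toy sketch. To be clear, this is my own illustration of the general pattern, not OnPrem.LLM's actual code or API; the class and method internals here are hypothetical stand-ins (see the linked documentation for the real interface):

```python
# Toy illustration of the ingest()/prompt() pattern used by local-RAG tools.
# NOT OnPrem.LLM's implementation; it only shows the two-step interface shape.

class ToyRAG:
    def __init__(self):
        self.chunks = []  # stands in for a real vector store

    def ingest(self, text, chunk_size=200):
        """Split a document into fixed-size chunks and index them."""
        for i in range(0, len(text), chunk_size):
            self.chunks.append(text[i:i + chunk_size])

    def _retrieve(self, question, k=1):
        """Rank chunks by naive word overlap (a real tool uses embeddings)."""
        q = set(question.lower().split())
        scored = sorted(self.chunks,
                        key=lambda c: len(q & set(c.lower().split())),
                        reverse=True)
        return scored[:k]

    def prompt(self, question):
        """Build the prompt a local LLM would receive (returned here
        instead of being sent to a model, to keep the sketch runnable)."""
        context = "\n".join(self._retrieve(question))
        return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

rag = ToyRAG()
rag.ingest("OnPrem.LLM runs ChatGPT-like models on your own machine. "
           "It supports retrieval-augmented generation over local documents.")
print(rag.prompt("What does OnPrem.LLM run?"))
```

The appeal of the design is that chunking, embedding, and vector-store plumbing all hide behind ingest(), so the caller only ever sees two methods.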
Saving you some time: if you have a MacBook Pro M1/M2 with 32 GB of RAM (I presume a lot of HN folks do), you can comfortably run the `34B` models.
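A rough back-of-the-envelope check on why 34B fits in 32 GB. This assumes 4-bit (Q4-style) quantization and a ~20% runtime overhead factor for KV cache and buffers; both numbers are my own assumptions for the estimate, not figures from the thread:

```python
# Rough memory estimate for running a quantized local model.
# Assumptions (mine): 4-bit quantized weights, ~20% overhead for
# KV cache and runtime buffers.

def model_memory_gb(n_params_billion, bits_per_weight=4, overhead=1.2):
    """Approximate resident memory in GB: weight bytes times an overhead factor."""
    weight_bytes = n_params_billion * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead / 1e9

print(f"34B @ 4-bit: ~{model_memory_gb(34):.0f} GB")     # ~20 GB, fits in 32 GB
print(f"34B @ 8-bit: ~{model_memory_gb(34, 8):.0f} GB")  # ~41 GB, does not fit
```

So a 4-bit 34B model leaves comfortable headroom on a 32 GB machine, while an 8-bit quantization of the same model would not fit.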
If you'd like a more hands-on approach, get llama.cpp from here:
- https://github.com/ggerganov/llama.cpp
This looks great! I appreciate having more options to easily run these models locally.
I've been working on something similar:
https://github.com/jncraton/languagemodels
The target audience is a bit different, as my personal use-case for this is allowing students to work with these models in introductory CS courses, but many of the design goals appear to be similar.
I like how you completely hide the complexity of the vector DB behind `ingest`. It looks like you've made it very easy to build local RAG apps.