local_llama
This repo showcases how you can run a model locally and offline, free of OpenAI dependencies.
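The repo documents its own exact setup; as a rough illustration of the local-and-offline pattern, here is a minimal sketch using llama-cpp-python (my library choice, not necessarily what local_llama uses), with an assumed path to a quantized GGUF model on disk:

```python
# Minimal local/offline inference sketch using llama-cpp-python.
# Assumptions (not from the repo): the library choice and the model path
# are illustrative; local_llama may wire things up differently.
from llama_cpp import Llama

# Load a quantized model from disk -- no network access or API key needed.
llm = Llama(
    model_path="./models/llama-2-7b-chat.Q4_K_M.gguf",  # assumed local path
    n_ctx=2048,  # context window size
)

# Run a completion entirely on the local machine.
result = llm(
    "Q: What does running a model locally buy you? A:",
    max_tokens=128,
    stop=["Q:"],
)
print(result["choices"][0]["text"])
```

Once the model file is on disk, nothing in this loop touches the network, which is what makes the fully offline ("airplane mode") usage mentioned below possible.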
You may want to take a look at Zep, which supports stateless agents by storing messages out of process (in a memory store). It does a number of other things too, such as summarization, entity extraction, and vector search over historical memory. Disclosure: I'm a coauthor. https://github.com/getzep/zep
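For anyone unfamiliar with the pattern: the agent holds no state itself and round-trips each turn to a separate memory service, which can then summarize and index the history on its side. The sketch below illustrates that shape only; the endpoints and payload fields are hypothetical stand-ins I made up, not Zep's actual API (see the linked repo for the real client):

```python
# Sketch of the out-of-process memory pattern: the agent stays stateless;
# each turn is appended to, and context is fetched from, a memory service.
# The routes and payload shapes below are HYPOTHETICAL illustrations,
# not Zep's actual API -- consult https://github.com/getzep/zep for that.
import requests

MEMORY_URL = "http://localhost:8000"  # assumed address of the memory service

def add_turn(session_id: str, role: str, content: str) -> None:
    """Persist one chat message out of process."""
    resp = requests.post(
        f"{MEMORY_URL}/sessions/{session_id}/messages",  # hypothetical route
        json={"role": role, "content": content},
    )
    resp.raise_for_status()

def get_context(session_id: str) -> str:
    """Fetch a condensed context window instead of replaying raw history."""
    resp = requests.get(
        f"{MEMORY_URL}/sessions/{session_id}/context"  # hypothetical route
    )
    resp.raise_for_status()
    # e.g. a rolling summary plus the most recent messages
    return resp.json()["summary"]
```

Because the service returns a summary rather than the raw transcript, this also sidesteps chat histories outgrowing the model's context length.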
I work with AWS daily, using Terraform, Python, and Java to create and maintain enterprise solutions. I have played with SageMaker, but it is so expensive that I hate to leave it up for longer than a day. I downloaded this and created a chat with your docs (entirely in airplane mode). Point being, I've hosted models both locally and in the cloud, but I ended up sticking with API calls since they're so cheap.