Hi @deet - Yes, it is! It automatically stores the cost per query in the Supabase table - here's how: https://github.com/BerriAI/litellm/blob/80d77fed7123af222011...
If you have ideas for improvement - we'd love a ticket/PR!
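For anyone curious what per-query cost tracking boils down to before it gets written to a table like that, here's a rough sketch: multiply token usage by per-model rates. The price table and function name below are illustrative placeholders, not litellm's actual implementation or real provider pricing.

```python
# Hedged sketch of per-query cost accounting.
# PRICE_PER_1K holds made-up placeholder rates, NOT real provider prices.
PRICE_PER_1K = {
    "gpt-3.5-turbo": {"prompt": 0.0015, "completion": 0.002},
    "claude-2": {"prompt": 0.008, "completion": 0.024},
}

def query_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Return the dollar cost of one query from its token counts."""
    rates = PRICE_PER_1K[model]
    return (prompt_tokens / 1000) * rates["prompt"] + (
        completion_tokens / 1000
    ) * rates["completion"]
```

A row per query (model, token counts, this cost) is then all the table needs for later aggregation.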
The idea of an LLM proxy is super compelling - there are a lot of powerful ideas baked into the proxy form factor. It reminds me a bit of what Cloudflare did for the web, making it both faster and safer/easier. Have you considered local LLMs at all, for Llama 2? A few people and I have been working on https://github.com/jmorganca/ollama/ and I was thinking how cool it would be to augment it with a proxy layer like this.
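To make the proxy idea concrete, here is a minimal sketch of the routing core such a layer needs: one OpenAI-style entry point that dispatches by model-name prefix to either a hosted API or a local backend. The prefixes and handler functions are hypothetical stand-ins, not any project's actual API.

```python
# Minimal sketch of an LLM proxy's routing layer. The handlers below are
# stubs standing in for real backend calls (hosted API vs. local model).
def call_hosted(model, messages):
    # Placeholder: a real proxy would forward to the provider's API here.
    return f"[hosted:{model}]"

def call_local(model, messages):
    # Placeholder: a real proxy would hit a local server (e.g. Ollama) here.
    return f"[local:{model}]"

# Checked in order; the empty prefix is the catch-all default.
ROUTES = [
    ("local/", call_local),
    ("", call_hosted),
]

def completion(model, messages):
    """Route one OpenAI-format request to the matching backend."""
    for prefix, handler in ROUTES:
        if model.startswith(prefix):
            return handler(model.removeprefix(prefix), messages)
```

The appeal of the form factor is that everything else - caching, cost logging, fallbacks, rate limits - can hang off this one choke point without callers changing their code.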
If you do want to self-host - there's some great libraries like https://github.com/lm-sys/FastChat and https://github.com/ggerganov/llama.cpp that might be helpful
If none of these really solve your issue - feel free to email me and I'm happy to help you figure something out - [email protected]