We now have a new chapter focusing on sparsity and clustering, two advanced compression techniques that you can use to reduce your model's footprint (size, latency, etc.) while retaining its accuracy. You can read the chapter here, and go through the accompanying codelabs here.
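To make the two techniques concrete, here is a minimal NumPy sketch of the core ideas: magnitude pruning (zero out the smallest weights to induce sparsity) and weight clustering (snap the remaining values to a few shared centroids via a simple 1-D k-means). This is an illustrative toy, not the actual chapter's or TensorFlow Model Optimization Toolkit's API; function names and parameters here are my own.

```python
import numpy as np

def prune_by_magnitude(weights, sparsity=0.5):
    """Zero out the smallest-magnitude weights until `sparsity` fraction are zero."""
    flat = np.abs(weights).flatten()
    k = int(len(flat) * sparsity)
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value becomes the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

def cluster_weights(weights, n_clusters=4, n_iter=10):
    """Replace each weight with its nearest shared centroid (naive 1-D k-means).

    After clustering, the tensor holds at most `n_clusters` distinct values,
    so it can be stored as small integer indices plus a centroid table.
    """
    flat = weights.flatten()
    centroids = np.linspace(flat.min(), flat.max(), n_clusters)
    for _ in range(n_iter):
        # Assign each weight to its closest centroid, then recompute centroids
        assignments = np.argmin(np.abs(flat[:, None] - centroids[None, :]), axis=1)
        for c in range(n_clusters):
            members = flat[assignments == c]
            if len(members):
                centroids[c] = members.mean()
    assignments = np.argmin(np.abs(flat[:, None] - centroids[None, :]), axis=1)
    return centroids[assignments].reshape(weights.shape)
```

A sparse tensor compresses well because the zeros need not be stored explicitly, and a clustered tensor compresses well because each weight can be encoded as a small index into the centroid table. In practice you would apply these during or after training with fine-tuning to recover accuracy, which is what the chapter's codelabs walk through.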
Thanks for sharing! That's a very timely topic. I've actually created a profiler to track and analyze inference optimizations, i.e., to enable the optimize-verify-evaluate loop.
NOTE:
The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives.
Hence, a higher number means a more popular project.
Related posts
-
Show HN: Python Monitoring for LLMs, OpenAI, Inference, GPUs
-
Show HN: Python Monitoring for AI: LLMs, OpenAI, Inference, GPUs
-
[N] Monitor OpenAI API Latency, Tokens, Rate Limits, and More with Graphsignal
-
Monitor OpenAI API Latency, Tokens, Rate Limits, and More