We've added a simple way to profile any model serving endpoint, including FastAPI, to identify bottlenecks and make inference (including data processing) faster, especially for big models and large data. Wanted to share it here in case someone is struggling with profiling and monitoring of deployed code and models.

By default, the generic Python profiler automatically profiles a sample of inferences (and measures all of them). You can also specify other profilers for PyTorch, TensorFlow, JAX, and ONNX Runtime. All profiles and metrics are available on the SaaS dashboard; there's nothing to set up.

A couple of links to get started:

Repo: https://github.com/graphsignal/graphsignal
FastAPI example: https://graphsignal.com/docs/integrations/fastapi/

Happy for any feedback!
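To give a rough idea of what the setup looks like, here's a minimal sketch of a FastAPI endpoint with Graphsignal enabled. The api_key, deployment name, and the run_model stand-in are placeholders, and the exact tracing call (graphsignal.start_trace here) is an assumption on my part; check the FastAPI example linked above for the current API.

    import time
    import graphsignal
    from fastapi import FastAPI

    # One-time setup at startup; api_key and deployment are placeholder values.
    graphsignal.configure(api_key='my-api-key', deployment='my-model-prod')

    app = FastAPI()

    def run_model(data):
        # Stand-in for real preprocessing + inference.
        time.sleep(0.05)
        return {'score': 0.42}

    @app.post('/predict')
    async def predict(payload: dict):
        # Wrap the inference so it is measured and sampled for profiling.
        # Exact context-manager name is an assumption; see the docs above.
        with graphsignal.start_trace(endpoint='predict'):
            result = run_model(payload)
        return result

With this in place, every request to /predict is measured, a subset of them is profiled, and the results show up on the dashboard without any extra infrastructure.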