[D] patterns for scaling video inference service?

This page summarizes the projects mentioned and recommended in the original post on /r/MachineLearning

  • server

    The Triton Inference Server provides an optimized cloud and edge inferencing solution. (by triton-inference-server)

    If you are working at the infrastructure level, I would use ECS with the NVIDIA Triton Inference Server. It can handle the multi-model paradigm through its ensemble feature (a bit of a misnomer, since it's really just a DAG of data flow through your models, though you can add an ensembling step at the end if desired). It also provides a nice HTTP or gRPC interface. With ECS you can also put an Application Load Balancer in front to scale further, but how you set that up will depend heavily on whether you are using stateful models.
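
To make the HTTP interface mentioned above a bit more concrete, here is a minimal sketch of a client request, assuming the server exposes the KServe v2 style endpoint `POST /v2/models/<model>/infer`. The model name `video_pipeline`, the tensor names `FRAME` and `BOXES`, and the address `localhost:8000` are hypothetical placeholders, not something from the original post; substitute whatever your own model configuration declares.

```python
import json
import urllib.request


def build_infer_request(input_name, frame, output_names):
    """Build a v2-style JSON inference body for a single uint8 frame.

    `frame` is a nested list shaped [height][width][channels].
    The tensor names are whatever your model config declares
    (the ones used below are hypothetical).
    """
    height = len(frame)
    width = len(frame[0])
    channels = len(frame[0][0])
    # Flatten to row-major order and prepend a batch dimension of 1.
    flat = [value for row in frame for pixel in row for value in pixel]
    return {
        "inputs": [
            {
                "name": input_name,
                "shape": [1, height, width, channels],
                "datatype": "UINT8",
                "data": flat,
            }
        ],
        "outputs": [{"name": name} for name in output_names],
    }


def infer(base_url, model_name, body, timeout=10.0):
    """POST the body to the model's infer endpoint and return parsed JSON."""
    request = urllib.request.Request(
        f"{base_url}/v2/models/{model_name}/infer",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Assumes a server is reachable at this (hypothetical) address and
    # serves a model or ensemble named "video_pipeline".
    tiny_frame = [[[0, 0, 0], [255, 255, 255]]]  # 1x2 RGB image
    body = build_infer_request("FRAME", tiny_frame, ["BOXES"])
    print(infer("http://localhost:8000", "video_pipeline", body))
```

If the ensemble is defined on the server side, the client only sends one request like this per frame or batch, and the data flow between the individual models stays inside the server. For video workloads, sending raw pixels as JSON is likely too slow; a binary or gRPC path would be the natural next step, but that is outside this sketch.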


NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.
