Multi-model serving options

This page summarizes the projects mentioned and recommended in the original post on /r/mlops.

  • jina

    ☁️ Build multimodal AI applications with cloud-native stack

  • Jina lets you serve all of your models through the same Gateway while deploying them as individual microservices. You can also tie your models together in a pipeline if needed. It also has some nice ML-focused features, such as dynamic batching.

  • server

    The Triton Inference Server provides an optimized cloud and edge inferencing solution. (by triton-inference-server)

  • You've already mentioned Seldon Core, which is well worth looking at, but if you're just after the raw multi-model serving (MMS) aspect rather than a fully fledged deployment framework, you could look at the individual inference servers: Triton Inference Server and MLServer both support multi-model serving for a wide variety of frameworks (and custom Python models). MLServer might be a better option since it has an MLflow runtime, but only you will be able to decide that. There may also be other inference servers that do MMS that I'm not aware of.

  • MLServer

    An inference server for your machine learning models, including support for multiple frameworks, multi-model serving and more


