Gazette Inference Service is a baseplate.go (Reddit’s Go web services framework) Thrift service whose single responsibility is serving ML inference requests to its clients. It is deployed on Reddit’s modern Kubernetes infrastructure.
Minsky is an internal baseplate.py (Reddit’s Python web services framework) Thrift service owned by Reddit’s Machine Learning team. It serves data, or derivations of data, related to content relevance heuristics (such as the similarity between subreddits, a subreddit’s topic, or a user’s propensity for a given subreddit) from various data stores such as Cassandra or in-process caches. Clients of Minsky use this data to improve Redditors’ experiences by surfacing the most relevant content. Over the last few years, a set of new ML capabilities, referred to as Gazette, was built into Minsky. Gazette is responsible for serving ML model inferences for personalization tasks, along with configuration-based schema resolution and feature fetching and transformation.
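To make Gazette’s responsibilities concrete, here is a minimal Python sketch of the three steps named above: resolving a model’s feature schema from configuration, fetching the features, and transforming them before inference. It is an illustration under stated assumptions, not Gazette’s actual code. Every name in it (`FeatureSpec`, `MODEL_SCHEMAS`, `resolve_schema`, `fetch_features`, `transform_features`, `infer`, and the feature and model names) is hypothetical, and a plain dict stands in for Cassandra or an in-process cache.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Mapping, Sequence


@dataclass(frozen=True)
class FeatureSpec:
    """Hypothetical description of one model input feature."""
    name: str                            # key of the feature in the data store
    default: float                       # value used when the feature is missing
    transform: Callable[[float], float]  # applied to the raw value after fetching


# Hypothetical configuration: model name -> ordered feature schema.
MODEL_SCHEMAS: Dict[str, List[FeatureSpec]] = {
    "subreddit_ranker": [
        FeatureSpec("user_subreddit_affinity", 0.0, lambda x: x),
        # Cap the age at one year and scale it into [0, 1].
        FeatureSpec("subreddit_age_days", 0.0, lambda x: min(x, 365.0) / 365.0),
    ],
}


def resolve_schema(model_name: str) -> List[FeatureSpec]:
    """Look up the feature schema for a model from configuration."""
    return MODEL_SCHEMAS[model_name]


def fetch_features(schema: Sequence[FeatureSpec],
                   store: Mapping[str, float]) -> List[float]:
    """Fetch raw feature values, falling back to each feature's default."""
    return [store.get(spec.name, spec.default) for spec in schema]


def transform_features(schema: Sequence[FeatureSpec],
                       raw: Sequence[float]) -> List[float]:
    """Apply each feature's configured transformation to its raw value."""
    return [spec.transform(value) for spec, value in zip(schema, raw)]


def infer(model_name: str,
          store: Mapping[str, float],
          model: Callable[[List[float]], float]) -> float:
    """Resolve the schema, fetch and transform features, then run the model."""
    schema = resolve_schema(model_name)
    raw = fetch_features(schema, store)
    return model(transform_features(schema, raw))
```

A caller would pass a model name, a feature store, and a scoring function, for example `infer("subreddit_ranker", store, sum)`, where `sum` is a stand-in for a real model. Keeping the schema in configuration means a new model can declare its inputs without changes to the fetching or transformation code.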
Minsky / Gazette is deployed on legacy Reddit infrastructure, using Puppet-managed server bootstrapping and deployment rollouts managed by an internal tool called rollingpin. Application instances are deployed across a cluster of EC2 instances managed by an autoscaling group, with 4 instances of the Minsky / Gazette Thrift server launched on each EC2 instance as independent processes. Einhorn is then used to load balance requests from clients across the 4 Minsky / Gazette processes. There is no virtualization between the instances of Minsky / Gazette on a single EC2 instance, so all of them share the same CPU and RAM.