Scaling Temporal: The Basics

This page summarizes the projects mentioned and recommended in the original post on dev.to.

  1. kube-prometheus

    Use Prometheus to monitor Kubernetes and applications running on Kubernetes

    For our load testing we’ve deployed Temporal on Kubernetes, using MySQL as the persistence backend. The MySQL instance has 4 CPU cores and 32GB RAM, and each Temporal service (Frontend, History, Matching, and Worker) runs 2 pods, each requesting 1 CPU core and 1GB RAM as a starting point. We’re not setting CPU limits on our pods; see our upcoming Temporal on Kubernetes post for more details on why. For monitoring we’ll use Prometheus and Grafana, installed via the kube-prometheus stack, which also gives us some useful Kubernetes metrics. (The first sketch after this list shows these resource settings as Helm values.)

  2. benchmark-workers

    Pre-written workflows and activities useful for benchmarking Temporal

    To create load on a Temporal Cluster, we need to start Workflows and have Workers to run them. To make it easy to set up load tests, we have packaged a simple Workflow and some Activities in the benchmark-workers package. Running a benchmark-worker container brings up a load-test Worker with default Temporal Go SDK settings; the only configuration it needs out of the box is the host and port of the Temporal Frontend service. (The second sketch after this list shows one way to deploy it.)

  3. temporal

    Temporal service

    However, as we mentioned, each shard needs management, and part of that management is a cache of Workflow histories for that shard. We can see the History pods’ memory usage rising quickly. If the pods run out of memory, Kubernetes will terminate and restart them (OOMKilled). This causes Temporal to rebalance the shards onto the remaining History pod(s), only to rebalance again once the new History pod comes back up. Each time you make a scaling change, be sure to check that all Temporal pods are still within their CPU and memory requests; pods that are frequently restarted are very bad for performance! To fix this, we can bump the memory limits for the History containers. Currently, it is hard to estimate how much memory a History pod will use, because the Workflow history cache limit is not set per host, or even in MB, but as a number of cache entries to store. There is work to improve this: github.com/temporalio/temporal/issues/2941. For now, we’ll set the History memory limit to 8GB and keep an eye on the pods; we can always raise it later if they need more. (The third sketch after this list shows this change as Helm values.)
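
Below is a sketch of the Kubernetes resource settings described in item 1, expressed as Helm values. The key names (server.<service>.replicaCount and resources) are assumptions based on the temporalio/helm-charts layout and may differ in your chart version; the point is 2 pods per service, requests of 1 CPU core and 1GB RAM, and no CPU limits.

    # Hedged sketch of the starting resource settings (values layout assumed from temporalio/helm-charts)
    server:
      frontend:
        replicaCount: 2
        resources:
          requests:
            cpu: 1        # 1 CPU core per pod as a starting point
            memory: 1Gi
          # no CPU limits on purpose, as discussed above
      history:
        replicaCount: 2
        resources:
          requests:
            cpu: 1
            memory: 1Gi
      # matching and worker use the same replicaCount and requests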
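
For item 2, a minimal Deployment for the benchmark Workers could look like the sketch below. The image path and environment variable name are assumptions and should be checked against the benchmark-workers README; the only setting that matters here is pointing the Worker at the Frontend’s host and port.

    # Hedged sketch: running benchmark-workers against the Temporal Frontend
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: benchmark-workers
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: benchmark-workers
      template:
        metadata:
          labels:
            app: benchmark-workers
        spec:
          containers:
            - name: benchmark-workers
              image: ghcr.io/temporalio/benchmark-workers:latest   # assumed image path
              env:
                - name: TEMPORAL_GRPC_ENDPOINT                     # assumed variable name; host:port of the Frontend
                  value: temporal-frontend.temporal.svc:7233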
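
Finally, the memory-limit change from item 3 would look roughly like this, again assuming the same Helm values layout; the request stays at 1GB, only a memory limit is added, and CPU remains unlimited.

    # Hedged sketch: raising the History containers' memory limit to 8GB
    server:
      history:
        resources:
          requests:
            cpu: 1
            memory: 1Gi
          limits:
            memory: 8Gi   # no CPU limit, per the note above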

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.


Related posts

  • Cadence – Fault-Tolerant Stateful Code Platform by Uber

    1 project | news.ycombinator.com | 17 Oct 2023
  • temporal VS laravel-workflow - a user suggested alternative

    2 projects | 23 Aug 2023
  • Temporal .NET – Deterministic Workflow Authoring in .NET

    3 projects | news.ycombinator.com | 3 May 2023
  • Mandala: experiment data management as a built-in (Python) language feature

    4 projects | /r/ProgrammingLanguages | 11 Apr 2023
  • are you interested in an end to end queue/pubsub & worker platform

    1 project | /r/devops | 14 Mar 2023
