Reddit Recap Series: Backend Performance Tuning

This page summarizes the projects mentioned and recommended in the original post on /r/RedditEng.

  • baseplate.py

    reddit's python service framework

  • Finally, a problem that we didn’t experience directly, but that came up during consultations with another team experienced with pgBouncer: the Baseplate.py framework that both teams use sometimes leaked connections, leaving them open after the request without returning them to the pool.
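
The leak described above is the kind of problem that a checkout wrapper with guaranteed release is meant to prevent. The following is a minimal sketch of that general pattern, not Baseplate.py's actual API: `TinyPool`, `checkout`, and `idle_count` are hypothetical names, and the "connections" are plain placeholder objects.

```python
import queue
from contextlib import contextmanager


class TinyPool:
    """Hypothetical fixed-size pool; the 'connections' are opaque objects."""

    def __init__(self, connections):
        self._idle = queue.Queue()
        for conn in connections:
            self._idle.put(conn)

    def idle_count(self):
        return self._idle.qsize()

    @contextmanager
    def checkout(self, timeout=1.0):
        # Raises queue.Empty if no connection becomes idle within `timeout`.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Runs on normal exit and on exceptions alike, so a failing
            # request handler cannot leave the connection checked out.
            self._idle.put(conn)
```

A handler would wrap its database work in `with pool.checkout() as conn:`; once the block exits, for any reason, the connection is back in the pool.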

  • SQLAlchemy

    The Database Toolkit for Python

  • The second problem was caused by the pgBouncer setup. pgBouncer is an impostor that owns several dozen real PostgreSQL connections, but pretends that it has thousands of them available for the backend services. Similar to fractional-reserve banking. So, it needs a way to find out when a real DB connection becomes free and can be used by another service. Our pgBouncer was configured with pool_mode=transaction. That is, it detected when the current transaction was over and returned the PostgreSQL connection to the pool, making it available to other users. However, this mode turned out not to work well with code that used SQLAlchemy: committing the current transaction immediately started a new one. So, the expensive connection between pgBouncer and PostgreSQL remained checked out for as long as the connection from the service to pgBouncer remained open (forever, or close to that).
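
To make the effect concrete, here is a toy model of transaction-mode pooling. It is only an illustration under simplifying assumptions, not pgBouncer's implementation: `ToyTransactionPooler`, `served_clients`, and the `autobegin` flag are hypothetical names, and `autobegin=True` stands in for the "commit immediately starts a new transaction" behavior described above.

```python
class ToyTransactionPooler:
    """Toy model: a server connection is lent to a client only while that
    client has an open transaction."""

    def __init__(self, server_connections):
        self.free = server_connections  # idle real connections
        self.holders = set()            # clients currently in a transaction

    def begin(self, client):
        if client in self.holders:
            return True                 # already holds a server connection
        if self.free == 0:
            return False                # pool exhausted, client has to wait
        self.free -= 1
        self.holders.add(client)
        return True

    def commit(self, client, autobegin=False):
        if client in self.holders:
            self.holders.remove(client)
            self.free += 1              # transaction over: back to the pool
        if autobegin:
            # A new transaction starts right away, so the server
            # connection is checked out again immediately.
            self.begin(client)


def served_clients(autobegin, clients=5, server_connections=2):
    """Count how many clients manage to run one transaction each."""
    pooler = ToyTransactionPooler(server_connections)
    served = 0
    for client in range(clients):
        if pooler.begin(client):
            served += 1
            pooler.commit(client, autobegin=autobegin)
    return served
```

In this model, `served_clients(autobegin=False)` serves all five clients with two server connections, while `served_clients(autobegin=True)` serves only the first two, because each of them keeps its server connection checked out after committing.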

  • PostgreSQL

    Mirror of the official PostgreSQL GIT repository. Note that this is just a *mirror* - we don't work with pull requests on github. To contribute, please see https://wiki.postgresql.org/wiki/Submitting_a_Patch

  • The way Recap uses a database is: at the very beginning of an HTTP request handler’s execution, it sends a single SELECT to PostgreSQL and retrieves a single JSON document with a particular user’s Recap data. After that, it’s done with the database, and continues to hydrate this data by querying a dozen external services.
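
The access pattern above (one SELECT, then no further database work) can be sketched as follows. This is an assumption-laden illustration, not Recap's actual code: the standard-library `sqlite3` module stands in for PostgreSQL, and the `recap` table, its columns, and the `load_recap`, `handle_request`, and `hydrators` names are all hypothetical.

```python
import json
import sqlite3
from contextlib import closing


def load_recap(connect, user_id):
    """Run the single SELECT and close the connection before returning."""
    with closing(connect()) as conn:
        row = conn.execute(
            "SELECT data FROM recap WHERE user_id = ?", (user_id,)
        ).fetchone()
    # The connection is already closed here; only the JSON text is kept.
    return json.loads(row[0]) if row else None


def handle_request(connect, user_id, hydrators):
    recap = load_recap(connect, user_id)  # all database work ends here
    if recap is None:
        return None
    for hydrate in hydrators:             # stand-ins for external service calls
        recap.update(hydrate(recap))
    return recap
```

Here `connect` is any zero-argument callable that returns a DB-API connection, for example `lambda: sqlite3.connect("recap.db")`. The point of the structure is that the slow hydration calls run only after the database connection has been given up.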

  • pgbouncer

    lightweight connection pooler for PostgreSQL

  • Our backend services use pgBouncer to pool PostgreSQL connections. During load testing, we found two problematic areas:
