Cascade of doom: JIT, and how a Postgres update led to 70% failure on a critical national service

This page summarizes the projects mentioned and recommended in the original post on dev.to

  • coronavirus-dashboard-summary

    UK Coronavirus Dashboard - Summary pages

  • We released a new version of summary pages implemented in F#.

  • coronavirus-dashboard-pipeline-etl

    UK Coronavirus Dashboard ETL

  • use the cache pre-population solution implemented in our ETL

  • pgbouncer

    lightweight connection pooler for PostgreSQL

  • Towards the end of the week, a few people reached out to me saying that they were encountering an increasing number of failed requests when using APIv2. Specifically, they said that they couldn't download new data from the API after the release. My first action in such cases is to check the API service and our PgBouncer instances to ensure that they are healthy. Then I clear the storage cache for APIv2, which usually solves the problem.

  • coronavirus-dashboard-generic-apis

    Coronavirus Dashboard (COVID-19) in the UK - Generic APIs

  • The number of grievances directed at me on Twitter increased. People were now reporting increased latency in our Generic API. This concerned me because, unlike APIv2, which is aggressively throttled and designed for downloading very large payloads, the Generic API is implemented in Go and uses highly optimised queries for rapid response. Whilst a download from APIv2 may take 3 or 4 minutes to complete, the average latency for our Generic API is ~50 milliseconds.

  • llvm-project

    The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.

  • Based on discussions in the issue ticket, it appears that the problem is associated with a setting entitled Just-in-time Compilation (JIT) that is turned on by default in Postgres 14 when it is compiled using LLVM.

  • asyncpg

    A fast PostgreSQL Database Client Library for Python/asyncio.

  • Simple query runs long when DB schema contains thousands of tables #186

  • coronavirus-dashboard

    Dashboard for tracking Coronavirus (COVID-19) across the UK

  • The UK coronavirus dashboard is the primary data reporting service for the COVID-19 pandemic in the United Kingdom. It sustains on average 45 to 50 million hits a day, and is regarded as a critical service.
