How DataStax Tracked Down a Linux Kernel Bug with Fallout

This page summarizes the projects mentioned and recommended in the original post on dev.to

  • Apache Pulsar

    Apache Pulsar - distributed pub-sub messaging system

    Fallout is an open-source distributed systems testing service that we use heavily at DataStax to run functional and performance tests for Apache Cassandra, Apache Pulsar, and other projects. Fallout automatically provisions and configures distributed systems and clients, runs a variety of workloads and benchmarks, and gathers test results for later analysis.

  • prometheus

    The Prometheus monitoring system and time series database.

    Cassandra provides several tools for understanding what is happening internally as it serves data to clients. Much of this is exposed as metrics in monitoring tools such as Prometheus and Grafana, which integrate with Fallout. Those metrics showed that request throughput (requests per second) dropped to near zero whenever we triggered the bug. To see what was happening on the server side, I waited for the bug to occur and then examined the output of nodetool tpstats.

  • Grafana

    The open and composable observability and data visualization platform. Visualize metrics, logs, and traces from multiple sources like Prometheus, Loki, Elasticsearch, InfluxDB, Postgres and many more.

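
The excerpt describes spotting the bug as request throughput falling to near zero in the monitoring metrics. As a rough illustration only, the sketch below flags such stalls in a list of (timestamp, requests-per-second) samples. It is not Fallout, Prometheus, or Cassandra code: the function name find_throughput_stalls, its parameters, and the sample values are all hypothetical, and it assumes the samples have already been exported from whatever monitoring tool is in use.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, requests per second)


def find_throughput_stalls(
    samples: List[Sample],
    baseline_fraction: float = 0.05,
    min_samples: int = 3,
) -> List[Tuple[float, float]]:
    """Return (start_ts, end_ts) windows where throughput stays near zero.

    Hypothetical helper: "near zero" is taken to mean at or below
    baseline_fraction of the median rate, sustained for at least
    min_samples consecutive samples. The median is only a sensible
    baseline if the run was healthy for more than half of the samples.
    """
    if not samples:
        return []

    # Use the (upper) median rate as the healthy baseline.
    rates = sorted(rate for _, rate in samples)
    baseline = rates[len(rates) // 2]
    threshold = baseline * baseline_fraction

    stalls: List[Tuple[float, float]] = []
    run: List[float] = []  # timestamps of the current below-threshold run
    for ts, rate in samples:
        if rate <= threshold:
            run.append(ts)
            continue
        # Throughput recovered: record the run if it lasted long enough.
        if len(run) >= min_samples:
            stalls.append((run[0], run[-1]))
        run = []

    # Handle a stall that lasts until the end of the data.
    if len(run) >= min_samples:
        stalls.append((run[0], run[-1]))
    return stalls


# Made-up samples: healthy traffic, a three-sample stall, then recovery.
example = [
    (0, 1000), (1, 1020), (2, 990),
    (3, 5), (4, 0), (5, 2),
    (6, 1010), (7, 1005),
]
stall_windows = find_throughput_stalls(example)
```

With the made-up samples above, the stall covers timestamps 3 through 5. A window found this way only indicates when to look more closely, for example by capturing nodetool tpstats on the affected node while the stall is in progress.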
