A Tale of Hashery and Woe: How Mutable Hash Keys Led to an ActiveRecord Bug

This page summarizes the projects mentioned and recommended in the original post on dev.to.

  • ruby

    The Ruby Programming Language

  • static unsigned
    ar_find_entry_hint(VALUE hash, ar_hint_t hint, st_data_t key)
    {
        unsigned i, bound = RHASH_AR_TABLE_BOUND(hash);
        const ar_hint_t *hints = RHASH(hash)->ar_hint.ary;

        for (i = 0; i < bound; i++) {
            if (hints[i] == hint) {
                ar_table_pair *pair = RHASH_AR_TABLE_REF(hash, i);
                if (ar_equal(key, pair->key)) {
                    return i;
                }
            }
        }
        return RHASH_AR_TABLE_MAX_BOUND;
    }
    # https://github.com/ruby/ruby/blob/ruby_3_2/hash.c#L701-L742
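    The C routine above compares a stored one-byte hint first and only then calls `ar_equal` on the keys. The following is a toy Ruby model of that two-step check, not Ruby's actual implementation. `TinyTable` and `Probe` are hypothetical names, and taking the hint from the low byte of the hash code is an assumption made for illustration. It shows why a key mutated after insertion is normally missed, yet is found again when its new hint happens to equal the stale one.

```ruby
# Toy model of a small-table lookup: match the stored hint, then compare keys.
class TinyTable
  Entry = Struct.new(:hint, :key, :value)

  def initialize
    @entries = []
  end

  # Assumption for illustration: the hint is the low byte of the hash code.
  def hint_for(key)
    key.hash & 0xff
  end

  # The hint is computed once, at insertion time, and never refreshed.
  def store(key, value)
    @entries << Entry.new(hint_for(key), key, value)
  end

  def find(key)
    hint = hint_for(key)
    entry = @entries.find { |e| e.hint == hint && e.key.eql?(key) }
    entry && entry.value
  end
end

# Hypothetical key whose hash code we can change by hand.
class Probe
  attr_accessor :code

  def initialize(code)
    @code = code
  end

  def hash
    @code
  end

  def eql?(other)
    other.is_a?(Probe) && other.code == @code
  end
end

table = TinyTable.new
probe = Probe.new(0x101)
table.store(probe, :cached_row)   # stored hint: 0x01

probe.code = 0x201                # new hash code, same low byte
false_positive = table.find(probe)

probe.code = 0x202                # new hash code, different low byte
miss = table.find(probe)
```

    In this model the first lookup succeeds because the stale hint still matches and the stored key is the very object being looked up, so the equality check passes trivially.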

  • Ruby on Rails

    Ruby on Rails

  • def cache_sql(sql, name, binds)
      @lock.synchronize do
        result =
          if @query_cache[sql].key?(binds)
            @query_cache[sql][binds]
          else
            @query_cache[sql][binds] = yield
          end
        result.dup
      end
    end
    # https://github.com/rails/rails/blob/v7.0.4/activerecord/lib/active_record/connection_adapters/abstract/query_cache.rb#L127-L141
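    `cache_sql` stores `binds` itself as the inner hash key, so anything the caller later does to objects inside `binds` also changes the stored key. One general way to avoid that is to store a private snapshot of the key. The sketch below illustrates the idea only and is not the change made in Rails. `SnapshotCache` is a hypothetical class, and the `Marshal` round trip assumes every object in the key can be marshalled, which may not hold for real bind objects.

```ruby
# Hypothetical cache that never stores the caller's own key object.
class SnapshotCache
  def initialize
    @store = {}
  end

  def fetch(key)
    return @store[key] if @store.key?(key)

    # Deep-copy the key so later mutation by the caller cannot reach it,
    # then freeze the outer container of the copy.
    snapshot = Marshal.load(Marshal.dump(key)).freeze
    @store[snapshot] = yield
  end
end

cache  = SnapshotCache.new
search = { "k" => 1 }
calls  = 0

first  = cache.fetch([search]) { calls += 1; :row_for_1 }

search["k"] = 2   # the caller mutates its own hash; the snapshot is unaffected

second = cache.fetch([search]) { calls += 1; :row_for_2 }
third  = cache.fetch([{ "k" => 1 }]) { calls += 1; :not_called }
```

    The trade-off is a copy on every cache miss, in exchange for keys whose contents can no longer drift away from the hash code they were stored under.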

  • rubygems

    Library packaging and distribution for Ruby.

  • # frozen_string_literal: true

    require "bundler/inline"

    gemfile(true) do
      source "https://rubygems.org"

      git_source(:github) { |repo| "https://github.com/#{repo}.git" }

      gem "rails", github: "rails/rails", branch: "main"
      gem "sqlite3"
    end

    require "active_record"
    require "minitest/autorun"
    require "logger"

    # This connection will do for database-independent bug reports.
    ActiveRecord::Base.establish_connection(adapter: "sqlite3", database: ":memory:", prepared_statements: true)
    # ActiveRecord::Base.logger = Logger.new(STDOUT) # you can enable this to see the cache loads, but it's noisy

    ActiveRecord::Schema.define do
      create_table :my_records, force: true do |t|
        t.json :value
        t.text :description
      end
    end

    class MyRecord < ActiveRecord::Base; end

    class QueryCacheMutableSearchTest < Minitest::Test
      def test_bug
        iterations = 10000
        false_positives = 0

        MyRecord.connection.enable_query_cache!

        iterations.times do
          key, val = rand(100000), rand(100000)
          record = MyRecord.create(value: { key => val }, description: "The record we want to find")

          search = { key => val }
          the_record = MyRecord.where(value: search).first # this should populate the cache
          assert the_record.present?

          # cache now looks like this, essentially:
          # { "SELECT * FROM my_records WHERE value = $1" =>
          #   { [search] => the_record }
          # }

          # pick a replacement value that is guaranteed to differ from val
          new_val = rand(100000)
          new_val = rand(100000) while new_val == val
          search.merge!(key => new_val) # this mutates the key inside the query cache

          # normally: because the hash of the key has changed, this is a cache miss
          # however, if the new hash key's numerical hash falls into the same bucket
          # as the original, the hash lookup will a) find the first query's entry and
          # b) use it, because the keys compare equal: the cached key is a reference
          # to the same, now-mutated `search` hash
          should_not_exist = MyRecord.where(value: search).first # this SHOULD not return a value
          false_positives += 1 if should_not_exist.present?

          record.destroy
          MyRecord.connection.clear_query_cache
        end

        assert_equal 0, false_positives
      end
    end
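    The comments in the script describe a hash whose key is mutated after insertion. The same situation can be reproduced in plain Ruby without Rails or a database. The sketch below uses only core `Hash` and `Array` behaviour; the variable names are illustrative, and `binds` merely stands in for the array the query cache uses as its key.

```ruby
cache  = {}
search = { "k" => 1 }
binds  = [search]            # stand-in for the binds array used as the cache key

cache[binds] = :cached_row   # filed under the hash code computed at this moment

search["k"] = 2              # mutates an object that lives inside the stored key

# Looking up the ORIGINAL contents: the hash code matches the stored entry,
# but the stored key now holds { "k" => 2 }, so the equality check fails.
stale_lookup = cache[[{ "k" => 1 }]]

# Array keys are stored by reference, so the table's key is our mutated array.
same_object = cache.keys.first.equal?(binds)

# Looking up `binds` itself at this point is usually a miss, but can hit when
# the new hash code happens to collide with the stale one (the bug above).

cache.rehash                 # re-index entries from the keys' current contents
after_rehash = cache[[{ "k" => 2 }]]
```

    `Hash#rehash` exists for exactly this situation, but relying on it means every mutation site must remember to call it, which is why avoiding mutable keys is usually the simpler rule.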

  • Puts Debuggerer

    Ruby library for improved puts debugging, automatically displaying bonus useful information such as source line number and source code.


