How to create a 1M record table with a single query

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • zefaker

    zefaker is a command-line tool for generating CSV, Excel, JSON and SQL files from a Groovy DSL

  • If you need another repeatable way to create random data that you can export as SQL INSERTs (or CSV/Excel files), you may find a tool we built and use at work useful: https://github.com/creditdatamw/zefaker

    It needs a little Groovy, but it is very convenient for generating random (or non-random) data.

  • synth

    The Declarative Data Generator (discontinued)

  • This looks convenient (and performant). But how does it scale as queries join across tables?

    If you need to create test data with complex business logic, referential integrity and constraints, we've been working on a declarative data generator that is built exactly for this: https://github.com/openquery-io/synth

  • faker

    Faker is a Python package that generates fake data for you. (by joke2k)

  • Creating realistic fake data is useful in lower environments and for load testing. Outside of SQL I like faker: https://github.com/joke2k/faker
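
The original post's own query is not reproduced on this page, so the following is only a minimal sketch of the general idea in the title: filling a table with many rows from one SQL statement. It assumes SQLite (through Python's standard sqlite3 module) and a recursive CTE as the row generator. The table name `records`, its columns, and the helper `create_random_table` are hypothetical names chosen for illustration.

```python
import sqlite3


def create_random_table(conn: sqlite3.Connection, n_rows: int) -> None:
    """Create a hypothetical `records` table and fill it with n_rows rows."""
    if n_rows < 1:
        raise ValueError("n_rows must be at least 1")
    conn.execute("DROP TABLE IF EXISTS records")
    conn.execute(
        "CREATE TABLE records (id INTEGER PRIMARY KEY, amount INTEGER, name TEXT)"
    )
    # A single statement generates every row: the recursive CTE counts
    # from 1 to n_rows, and the SELECT attaches pseudo-random values.
    conn.execute(
        """
        WITH RECURSIVE seq(id) AS (
            SELECT 1
            UNION ALL
            SELECT id + 1 FROM seq WHERE id < :n
        )
        INSERT INTO records (id, amount, name)
        SELECT id, abs(random() % 1000), 'user_' || id
        FROM seq
        """,
        {"n": n_rows},
    )
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_random_table(conn, 1_000_000)
    print(conn.execute("SELECT COUNT(*) FROM records").fetchone()[0])
```

The values produced this way are random but not realistic. For realistic names, addresses and similar fields, the tools listed above are a better fit.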

