Generate Synthetic Data in 3 Lines of Code

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com.

  • multi-table

    Notebook and code to synthesize relational databases such as Postgres and MySQL.

  • We have some preliminary work in this direction: https://github.com/gretelai/multi-table

    I love the idea of "table space" though. It would be fun to traverse this space and output a new database at each step, like a VAE.

  • condenser

    Condenser is a database subsetting tool.

  • Thanks for the shoutout, cush! And yup, our platform Tonic enables developers to realistically de-identify their data while preserving relationships and consistency across tables within their DBs, to optimize dev and test with real fake data. You can sign up for a sandbox here: https://www.tonic.ai/

    We've also recently released a new platform called Djinn that is specifically designed for data science workflows. It enables you to query from tables across your DB to build customized views of only the data you need and synthesize high-fidelity data based on models trained on those views. Relationships are fully preserved and no external scripting is required. You can create an account and take it for a spin here: https://djinn.tonic.ai/?signup

    Full disclosure, I'm Chiara Colombi, Product Marketing Manager at Tonic.ai. Cheers!

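
One of the comments above describes de-identifying data while preserving relationships and consistency across tables. As a rough illustration of that general idea (not Tonic's actual method, which the comment does not describe), a keyed hash can map each sensitive value to the same pseudonym wherever it appears, so joins between tables still work. The table contents, the `SECRET_KEY` value, and the `pseudonymize` and `deidentify` helpers are all hypothetical names made up for this sketch.

```python
import hashlib
import hmac

# Hypothetical demo key; a real setup would load this from a secret store.
SECRET_KEY = b"demo-only-key"


def pseudonymize(value: str, prefix: str = "user") -> str:
    """Map a sensitive value to a stable pseudonym using HMAC-SHA256."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{prefix}_{digest[:12]}"


def deidentify(rows, column):
    """Return copies of the rows with one column replaced by pseudonyms."""
    return [{**row, column: pseudonymize(row[column])} for row in rows]


# Two made-up tables that share the sensitive "email" column as a join key.
users = [
    {"email": "ada@example.com", "plan": "pro"},
    {"email": "bob@example.com", "plan": "free"},
]
orders = [
    {"order_id": 1, "email": "ada@example.com", "total": 30},
    {"order_id": 2, "email": "ada@example.com", "total": 12},
    {"order_id": 3, "email": "bob@example.com", "total": 7},
]

safe_users = deidentify(users, "email")
safe_orders = deidentify(orders, "email")

# The same input always yields the same pseudonym, so orders still
# join to the right user after the real emails are gone.
for order in safe_orders:
    owner = next(u for u in safe_users if u["email"] == order["email"])
    print(order["order_id"], owner["plan"])
```

Because the mapping is deterministic, anyone holding the key can recompute pseudonyms for known inputs, so this sketch only shows how consistency across tables can be kept; it is not a complete anonymization scheme.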


Related posts

  • Recommendation for tool or script for sanitizing data

    1 project | /r/SQLServer | 12 Jan 2023
  • Is it atypical to have a dev DB service on your local environment?

    1 project | /r/ExperiencedDevs | 1 Dec 2022
  • Anonymize test data?

    1 project | /r/softwaretesting | 30 Nov 2022
  • Preserve the unique relationships between data columns while wiping sensitive information from those columns using randomization.

    1 project | /r/u_Tonic_ai | 21 Sep 2022
  • Don't let your test data suffer - meet the Tonic and Google BigQuery partnership.

    1 project | /r/u_Tonic_ai | 21 Sep 2022