Python synthetic-data

Open-source Python projects categorized as synthetic-data

Top 23 Python synthetic-data Projects

  • Mimesis

    Mimesis is a robust data generator for Python that can produce a wide range of fake data in multiple languages.

  • BlenderProc

    A procedural Blender pipeline for photorealistic training image generation

  • SDV

    Synthetic data generation for tabular data

    Project mention: Synthetic data generation for tabular data | news.ycombinator.com | 2024-02-27

    Can someone help me understand the licensing of this?

    https://github.com/sdv-dev/SDV/blob/main/LICENSE

    It was MIT licensed up until 2022, when it was changed to what it is now. They say that it will become MIT again 4 years after release... but is that counted from when the license was changed or from the first release of the software on GitHub?

  • CTGAN

    Conditional GAN for generating synthetic tabular data.

    Project mention: Ctgan: Generating synthetic data in Python using GANs | news.ycombinator.com | 2024-02-05
  • DataDreamer

    DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤

    Project mention: FLaNK AI - 01 April 2024 | dev.to | 2024-04-01
  • pygraft

    Configurable Generation of Synthetic Schemas and Knowledge Graphs at Your Fingertips

  • bonito

    A lightweight library for generating synthetic instruction tuning datasets for your data without GPT. (by BatsResearch)

    Project mention: FLaNK AI for 11 March 2024 | dev.to | 2024-03-11
  • gretel-synthetics

    Synthetic data generators for structured and unstructured text, featuring differentially private learning.

  • Copulas

    A library to model multivariate data using copulas.

  • synthcity

    A library for generating and evaluating synthetic tabular data for privacy, fairness and data augmentation.

  • DoppelGANger

    [IMC 2020 (Best Paper Finalist)] Using GANs for Sharing Networked Time Series Data: Challenges, Initial Promise, and Open Questions

  • zpy

    Synthetic data for computer vision. An open source toolkit using Blender and Python.

  • Robotics-Object-Pose-Estimation

    A complete end-to-end demonstration in which we collect training data in Unity and use that data to train a deep neural network to predict the pose of a cube. This model is then deployed in a simulated robotic pick-and-place task.

  • SDGym

    Benchmarking synthetic data generation methods.

  • SDMetrics

    Metrics to evaluate quality and efficacy of synthetic datasets.

    Project mention: SDMetrics: Library for evaluating synthetic data quality | news.ycombinator.com | 2024-04-12
  • AgML

    AgML is a centralized framework for agricultural machine learning. AgML provides access to public agricultural datasets for common agricultural deep learning tasks, with standard benchmarks and pretrained models, as well as the ability to generate synthetic data and annotations.

    Project mention: Access to public agricultural datasets for agricultural deep learning tasks | news.ycombinator.com | 2023-11-05
  • edsl

    Design, conduct and analyze results of AI-powered surveys and experiments. Simulate social science and market research with large numbers of AI agents and LLMs.

    Project mention: Python Library for Structured Data Extraction via LLM | news.ycombinator.com | 2024-08-14

    Hey thanks for noticing - here's the MIT licensed library it's based on: https://github.com/expectedparrot/edsl

  • FAST-RIR

    This is the official implementation of our neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating room impulse responses (RIRs) for a given acoustic environment.

  • DeepEcho

    Synthetic Data Generation for mixed-type, multivariate time series.

    Project mention: DeepEcho: Synthetic Data Generation Library | news.ycombinator.com | 2024-02-05
  • Main

    Main folder. Material related to my books on synthetic data and generative AI. Also contains documents blending components from several folders, or covering topics spanning multiple folders. (by VincentGranville)

    Project mention: Synthetic Data Benchmark [pdf] | news.ycombinator.com | 2024-06-21
  • anonymeter

    A Unified Framework for Quantifying Privacy Risk in Synthetic Data according to the GDPR

  • discus

    A data-centric AI package for ML/AI. Get the best high-quality data for the best results. Discord: https://discord.gg/t6ADqBKrdZ

  • gretel-python-client

    The Gretel Python Client allows you to interact with the Gretel REST API.

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source synthetic-data projects in Python? This list will help you:

Rank Project Stars
1 Mimesis 4,400
2 BlenderProc 2,758
3 SDV 2,321
4 CTGAN 1,243
5 DataDreamer 808
6 pygraft 664
7 bonito 662
8 gretel-synthetics 580
9 Copulas 545
10 synthcity 431
11 DoppelGANger 298
12 zpy 298
13 Robotics-Object-Pose-Estimation 271
14 SDGym 254
15 SDMetrics 205
16 AgML 177
17 edsl 176
18 FAST-RIR 151
19 DeepEcho 101
20 Main 82
21 anonymeter 67
22 discus 63
23 gretel-python-client 53

