Top 23 Python synthetic-data Projects
-
Mimesis
Mimesis is a robust data generator for Python that can produce a wide range of fake data in multiple languages.
-
Can someone help me understand the licensing of this?
https://github.com/sdv-dev/SDV/blob/main/LICENSE
It was MIT licensed until 2022, when it was changed to the current license, which says it will become MIT again 4 years after release... but is that counted from when the license was changed or from the first release of the software on GitHub?
-
Project mention: Ctgan: Generating synthetic data in Python using GANs | news.ycombinator.com | 2024-02-05
-
bonito
A lightweight library for generating synthetic instruction tuning datasets for your data without GPT. (by BatsResearch)
-
gretel-synthetics
Synthetic data generators for structured and unstructured text, featuring differentially private learning.
-
synthcity
A library for generating and evaluating synthetic tabular data for privacy, fairness and data augmentation.
-
DoppelGANger
[IMC 2020 (Best Paper Finalist)] Using GANs for Sharing Networked Time Series Data: Challenges, Initial Promise, and Open Questions
-
Robotics-Object-Pose-Estimation
A complete end-to-end demonstration in which we collect training data in Unity and use that data to train a deep neural network to predict the pose of a cube. This model is then deployed in a simulated robotic pick-and-place task.
-
Project mention: SDMetrics: Library for evaluating synthetic data quality | news.ycombinator.com | 2024-04-12
-
AgML
AgML is a centralized framework for agricultural machine learning. AgML provides access to public agricultural datasets for common agricultural deep learning tasks, with standard benchmarks and pretrained models, as well as the ability to generate synthetic data and annotations.
Project mention: Access to public agricultural datasets for agricultural deep learning tasks | news.ycombinator.com | 2023-11-05
-
edsl
Design, conduct and analyze results of AI-powered surveys and experiments. Simulate social science and market research with large numbers of AI agents and LLMs.
Project mention: Python Library for Structured Data Extraction via LLM | news.ycombinator.com | 2024-08-14
Hey thanks for noticing - here's the MIT licensed library it's based on: https://github.com/expectedparrot/edsl
-
FAST-RIR
This is the official implementation of our neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating room impulse responses (RIRs) for a given acoustic environment.
-
Main
Main folder. Material related to my books on synthetic data and generative AI. Also contains documents blending components from several folders, or covering topics spanning multiple folders. (by VincentGranville)
-
discus
A data-centric AI package for ML/AI. Get the best high-quality data for the best results. Discord: https://discord.gg/t6ADqBKrdZ
Python synthetic-data discussion
Python synthetic-data related posts
-
Launch HN: Trellis (YC W24) – AI-powered workflows for unstructured data
-
SDMetrics: Library for evaluating synthetic data quality
-
Synthetic data generation for tabular data
-
Ctgan: Generating synthetic data in Python using GANs
-
Phibrarian Alpha - the first model checkpoint from SciPhi's Mistral-7b
-
With LLMs we can create a fully open-source Library of Alexandria.
-
Textbook was authored with an AI pipeline
-
Index
What are some of the best open-source synthetic-data projects in Python? This list will help you:
# | Project | Stars |
---|---|---|
1 | Mimesis | 4,400 |
2 | BlenderProc | 2,758 |
3 | SDV | 2,321 |
4 | CTGAN | 1,243 |
5 | DataDreamer | 808 |
6 | pygraft | 664 |
7 | bonito | 662 |
8 | gretel-synthetics | 580 |
9 | Copulas | 545 |
10 | synthcity | 431 |
11 | DoppelGANger | 298 |
12 | zpy | 298 |
13 | Robotics-Object-Pose-Estimation | 271 |
14 | SDGym | 254 |
15 | SDMetrics | 205 |
16 | AgML | 177 |
17 | edsl | 176 |
18 | FAST-RIR | 151 |
19 | DeepEcho | 101 |
20 | Main | 82 |
21 | anonymeter | 67 |
22 | discus | 63 |
23 | gretel-python-client | 53 |