synthetic-dataset-generation

Open-source projects categorized as synthetic-dataset-generation

Top 15 synthetic-dataset-generation Open-Source Projects

  • AutoPrompt

    A framework for prompt tuning using Intent-based Prompt Calibration

    Project mention: FLaNK 04 March 2024 | dev.to | 2024-03-04

  • com.unity.perception

    Perception toolkit for sim2real training and validation in Unity

  • DataDreamer

    DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models.   🤖💤

    Project mention: FLaNK AI - 01 April 2024 | dev.to | 2024-04-01

  • pygraft

    Configurable Generation of Synthetic Schemas and Knowledge Graphs at Your Fingertips

    Project mention: PyGraft: Configurable Generation of Schemas and Knowledge Graphs | news.ycombinator.com | 2023-09-13

  • bonito

    A lightweight library for generating synthetic instruction tuning datasets for your data without GPT. (by BatsResearch)

    Project mention: FLaNK AI for 11 March 2024 | dev.to | 2024-03-11

  • SynthDet

    SynthDet - An end-to-end object detection pipeline using synthetic data

  • PeopleSansPeople

    Unity's privacy-preserving human-centric synthetic data generator

  • DoppelGANger

    [IMC 2020 (Best Paper Finalist)] Using GANs for Sharing Networked Time Series Data: Challenges, Initial Promise, and Open Questions

  • REaLTabFormer

    A suite of auto-regressive and Seq2Seq (sequence-to-sequence) transformer models for tabular and relational synthetic data generation.

  • DeFMO

    [CVPR 2021] DeFMO: Deblurring and Shape Recovery of Fast Moving Objects

  • VQASynth

    Compose multimodal datasets 🎹

    Project mention: Show HN: VQASynth – pipelines to synthesize VQA datasets | news.ycombinator.com | 2024-02-23

  • discus

    A data-centric AI package for ML/AI. Get the best high-quality data for the best results. Discord: https://discord.gg/t6ADqBKrdZ

    Project mention: an open source package helping developers generate data for LLMs | /r/mlops | 2023-08-02

  • nist-crc-2023

    NIST Collaborative Research Cycle on Synthetic Data. Learn about Synthetic Data week by week!

    Project mention: Assessing the Quality of Synthetic Data with Data-centric AI | /r/ArtificialInteligence | 2023-07-13

    Data Quality is key for all applications and models, and LLMs are no exception :) I've been working on a small community project with synthetic data using ydata-synthetic, and it really shows! Underrepresentation (category imbalance) and missing data are two of the main issues!

  • synthetic-dataset-object-detection

    How to Create Synthetic Dataset for Computer Vision (Object Detection) (Article on Medium)

  • tdk-demo

    This is a collection of TDK demo projects that use different databases and options

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

synthetic-dataset-generation related posts

  • The cute demo if you want to generate test data for your DB

    1 project | /r/data | 24 Mar 2023
  • The cute demo if you want to generate test data for your DB

    1 project | /r/Database | 24 Mar 2023
  • World Bank Researchers Open Source REaLTabFormer: A Tabular and Relational Synthetic Data Generation Model

    1 project | /r/machinelearningnews | 21 Feb 2023
  • REaLTabFormer: Generating realistic synthetic data using GPT in Python

    1 project | /r/Python | 20 Feb 2023
  • Show HN: REaLTabFormer – GPT-based synthetic data generator

    1 project | news.ycombinator.com | 17 Feb 2023

Index

What are some of the best open-source synthetic-dataset-generation projects? This list will help you:

#   Project                             Stars
1   AutoPrompt                          1,716
2   com.unity.perception                878
3   DataDreamer                         681
4   pygraft                             641
5   bonito                              527
6   SynthDet                            352
7   PeopleSansPeople                    295
8   DoppelGANger                        277
9   REaLTabFormer                       184
10  DeFMO                               164
11  VQASynth                            82
12  discus                              60
13  nist-crc-2023                       27
14  synthetic-dataset-object-detection  20
15  tdk-demo                            17
