Top 19 data-generation Open-Source Projects
-
Grounded-Segment-Anything
Grounded-SAM: Marrying Grounding-DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
-
Can someone help me understand the licensing of this?
https://github.com/sdv-dev/SDV/blob/main/LICENSE
It was MIT-licensed until 2022, when it was changed to the current license. They say it will become MIT again 4 years after release, but is that counted from when the license was changed or from the first release of the software on GitHub?
-
Project mention: Ctgan: Generating synthetic data in Python using GANs | news.ycombinator.com | 2024-02-05
-
genalog
Genalog is an open source, cross-platform python package allowing generation of synthetic document images with custom degradations and text alignment capabilities.
-
REaLTabFormer
A suite of auto-regressive and Seq2Seq (sequence-to-sequence) transformer models for tabular and relational synthetic data generation.
-
rapiddweller-benerator-ce
BENERATOR is a leading software solution to generate, obfuscate, pseudonymize and migrate data for development, testing, and training purposes with a model-driven approach.
-
You could generate synthetic data to build your dashboard, either with normal Python or something like https://github.com/tinybirdco/mockingbird. Or, get some old data and have a script push it row by row into Kafka to emulate a stream.
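Both suggestions in that comment can be sketched with the Python standard library alone. The example below is a minimal illustration under stated assumptions, not code from any project listed here: the event fields, the function names `synthetic_events` and `replay`, and the `send` callback are all hypothetical. The callback stands in for whatever producer you use (a Kafka client's send method, for instance), so no Kafka API is assumed.

```python
import json
import random
import time
from datetime import datetime, timedelta, timezone
from typing import Callable, Iterable, Iterator


def synthetic_events(n: int, seed: int = 0) -> Iterator[dict]:
    """Yield n fake page-view events with strictly increasing timestamps.

    The schema is invented for illustration only.
    """
    rng = random.Random(seed)  # seeded so repeated runs produce the same data
    ts = datetime(2024, 1, 1, tzinfo=timezone.utc)
    for i in range(n):
        ts += timedelta(seconds=rng.randint(1, 30))
        yield {
            "event_id": i,
            "timestamp": ts.isoformat(),
            "user_id": rng.randint(1, 100),
            "page": rng.choice(["/home", "/pricing", "/docs"]),
        }


def replay(rows: Iterable[dict], send: Callable[[bytes], None], delay_s: float = 0.0) -> int:
    """Push rows one at a time through send, pausing delay_s seconds between rows.

    Returns the number of rows sent.
    """
    count = 0
    for row in rows:
        send(json.dumps(row).encode("utf-8"))
        count += 1
        if delay_s:
            time.sleep(delay_s)
    return count


if __name__ == "__main__":
    # Stand-in sink that prints each message; replace it with a real producer call.
    replay(synthetic_events(5), send=lambda payload: print(payload.decode()), delay_s=0.1)
```

To replay old data instead of generated data, pass any iterable of dictionaries as `rows`, for example the rows of a `csv.DictReader` opened on a historical export.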
-
hypothesis-graphql
Generate arbitrary queries matching your GraphQL schema, and use them to verify your backend implementation.
-
trainer
Simple interface to synthesize complex and high-dimensional datasets using Gretel APIs. (by gretelai)
-
Project mention: Show HN: Data Caterer – Data generation and validation tool | news.ycombinator.com | 2024-03-22
data-generation related posts
- Show HN: Data Caterer – Data generation and validation tool
- Ctgan: Generating synthetic data in Python using GANs
- Synth: A tool for generating realistic data using a declarative data model
- Streaming analytics
- What are some good publicly available real-time data sources?
- The cute demo if you want to generate test data for your DB
Index
What are some of the best open-source data-generation projects? This list will help you:
# | Project | Stars |
---|---|---|
1 | Grounded-Segment-Anything | 13,314 |
2 | generatedata | 2,173 |
3 | SDV | 2,105 |
4 | CTGAN | 1,130 |
5 | StreamData | 838 |
6 | Mockneat | 523 |
7 | regexp-examples | 520 |
8 | Copulas | 501 |
9 | genalog | 294 |
10 | REaLTabFormer | 182 |
11 | rapiddweller-benerator-ce | 128 |
12 | awesome-synthetic-data | 90 |
13 | DeepEcho | 87 |
14 | mockingbird | 70 |
15 | hypothesis-graphql | 40 |
16 | trainer | 28 |
17 | tdk-demo | 16 |
18 | data-caterer | 12 |
19 | dummPy | 0 |