Jupyter Notebook synthetic-data

Open-source Jupyter Notebook projects categorized as synthetic-data

Top 9 Jupyter Notebook synthetic-data Projects

  • machine-learning-for-trading

    Code for Machine Learning for Algorithmic Trading, 2nd edition.

  • Project mention: Machine Learning for Trading: Notebooks, resources and references accompanying the book Machine Learning for Algorithmic Trading. Courses - star count:10678.0 | /r/algoprojects | 2023-11-20
  • ydata-synthetic

    Synthetic data generators for tabular and time-series data

  • Project mention: Coding Wonderland: Contribute to YData Profiling and YData Synthetic in this Advent of Code | dev.to | 2023-12-05

    Send us your North ⭐️: "On the first day of Christmas, my true contributor gave to me..." a star in my GitHub tree! 🎵 If you love these projects too, star ydata-profiling or ydata-synthetic and let your friends know why you love it so much!

  • awesome-data-centric-ai

    Open-Source Software, Tutorials, and Research on Data-Centric AI 🤖

  • Project mention: Thoughts: Continue current degree with one year left, or start anew with degree apprenticeship | /r/cscareerquestionsuk | 2023-07-13

    I would finish the degree anyway. It's only one year left. If teachers miss classes, I would disregard that and try to learn on my own, and then yes, I would move on to an internship (or even do it at the same time if it's possible). If you like, come and meet us at the Data-Centric AI Community and we can do some projects together :)

  • genalog

    Genalog is an open-source, cross-platform Python package for generating synthetic document images with custom degradations and text alignment capabilities.

  • REaLTabFormer

    A suite of auto-regressive and Seq2Seq (sequence-to-sequence) transformer models for tabular and relational synthetic data generation.

  • awesome-python-for-data-science

    A curated list of awesome resources such as books, tutorials, courses, open-source libraries, exercises, and other materials that support Pythonistas in the making, and Pythonistas migrating into Data Science! 📊

  • Project mention: [D] Best tools to learn data science nowadays? | /r/MachineLearning | 2023-07-28

    We're updating our awesome-python-for-data-science repository.

  • synthetic-data-genomics

    Proof of concept code from Gretel.ai and Illumina using generative neural networks to create synthetic versions of mouse genotype and phenotype data.

  • nist-crc-2023

    NIST Collaborative Research Cycle on Synthetic Data. Learn about Synthetic Data week by week!

  • Project mention: Assessing the Quality of Synthetic Data with Data-centric AI | /r/ArtificialInteligence | 2023-07-13

    Data Quality is key for all applications and models, and LLMs are no exception :) I've been working on a small community project with synthetic data using ydata-synthetic, and it really shows! Underrepresentation (category imbalance) and missing data are two of the main issues!

  • multi-table

    Notebook and code to synthesize relational databases such as Postgres and MySQL.

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

What are some of the best open-source synthetic-data projects in Jupyter Notebook? This list will help you:

#  Project                          Stars
1  machine-learning-for-trading     12,080
2  ydata-synthetic                  1,345
3  awesome-data-centric-ai          306
4  genalog                          296
5  REaLTabFormer                    188
6  awesome-python-for-data-science  70
7  synthetic-data-genomics          32
8  nist-crc-2023                    27
9  multi-table                      7
