versatile-data-kit

Build, run and manage your data pipelines with Python or SQL on any cloud (by vmware)

Versatile-data-kit Alternatives

Similar projects and alternatives to versatile-data-kit

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a better versatile-data-kit alternative or higher similarity.

versatile-data-kit reviews and mentions

Posts with mentions or reviews of versatile-data-kit. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-11-23.
  • Can we take a moment to appreciate how much of dataengineering is open source?
    8 projects | reddit.com/r/dataengineering | 23 Nov 2022
    Free, Python+SQL ELT pipelines framework with orchestration functionality https://github.com/vmware/versatile-data-kit
    If you wish to contribute, projects usually have good first issues: https://github.com/vmware/versatile-data-kit/labels/good%20first%20issue If you wish to learn, check out examples: https://github.com/vmware/versatile-data-kit/tree/main/examples
  • DE Open Source
    2 projects | reddit.com/r/dataengineering | 13 Nov 2022
    Versatile Data Kit is a framework to build, run and manage your data pipelines with Python or SQL on any cloud: https://github.com/vmware/versatile-data-kit Here's a list of good first issues: https://github.com/vmware/versatile-data-kit/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22 Join our Slack channel to connect with our team: https://cloud-native.slack.com/archives/C033PSLKCPR
  • What is a personality type of a Data Engineer?
    2 projects | reddit.com/r/dataengineering | 26 Oct 2022
    Okay, I will explain what I am doing and how I see the "fun" in the project. I work with an open-source framework for data engineers. The community members are developers and people who use the tool (DEs). I do facilitate a monthly community meeting for everyone to meet and discuss important topics, but that's the only part that takes their time directly, and it's totally voluntary. DEs usually don't join, but I'm glad that the developers join and participate. What I had in mind is more of a design and promotion question. I have a vision for open source projects to have a feel of friendliness and openness (fun), which I communicate through the design and visuals that are part of the repo and the information we share about the project. And because I don't find long texts engaging (I literally can't focus when I see a long description of, say, a GitHub repo), I struggle with very detailed descriptions. That said, I wish to transform the project into something more like this: https://github.com/mage-ai/mage-ai instead of this: https://github.com/vmware/versatile-data-kit But I'm questioning myself, and thinking that maybe it is better suited for DEs as it is.
  • Best Open source no-code ELT tool for startup
    5 projects | reddit.com/r/dataengineering | 29 Aug 2022
    Open source, good for basic SQL and/or Python skills, extensible, and provides support in setup/adoption of the framework: https://github.com/vmware/versatile-data-kit I'm the community manager for this project. I built my first full ELT pipeline (tracking GitHub stats) with no previous experience, in my first month, totally by myself. It covers the full data journey. Oh, and it has Airflow integration, so you can have a dashboard to see your jobs and dependencies, but it has better/more intuitive scheduling.
  • I created a pipeline extracting Reddit data using Airflow, Docker, Terraform, S3, dbt, Redshift, and Google Data Studio
    7 projects | reddit.com/r/dataengineering | 25 Jun 2022
    To simplify steps 1-5, I can bring another framework to your attention: Versatile Data Kit (entirely open-source), which allows you to create data jobs (be it ingestion, transformation, or publishing) with SQL/Python. It runs on any cloud and is also multi-tenant.
  • ELT of my own Strava data using the Strava API, MySQL, Python, S3, Redshift, and Airflow
    2 projects | reddit.com/r/dataengineering | 24 Jun 2022
    I believe that you would not need to build the Docker image yourself. There are data engineering frameworks which allow you to build your data jobs yourself and take care of the containerisation of your pipeline. You can have a look at this ingest from REST API example. They would also allow you to schedule your data job using cron, while the data job itself can contain SQL & Python.
  • How-to-Guide: Contributing to Open Source
    19 projects | reddit.com/r/dataengineering | 11 Jun 2022
  • Has anyone "inherited" a pipeline/code/model that was so poorly written they wanted to quit their job?
    2 projects | reddit.com/r/datascience | 3 May 2022
    I wouldn't stay there if they absolutely disagree with changing things; it would drain my energy and I'd just get sad and depressed. On the other hand, if you decide to go for it and try to untangle this mess, I think it would build your confidence, but it would take some real patience and persistence. I'm a real automation geek: everything that can be automated should be. If you want advice, I would check out this open-source DataOps / automation tool: https://github.com/vmware/versatile-data-kit Maybe it helps, maybe not. Whatever you do, good luck!
  • Python or Tool for Pipelines
    2 projects | reddit.com/r/dataengineering | 9 Dec 2021
    I would recommend taking a look at Versatile Data Kit. It is an open-source tool that covers the full end-to-end cycle of data engineering with DataOps practices embedded: ingesting data from a source system, transformations (including implementations of some design patterns like Kimball), and publishing data (for reports, apps).

Stats

Basic versatile-data-kit repo stats
49
245
8.9
6 days ago