versatile-data-kit VS dbt-core

Compare versatile-data-kit vs dbt-core and see how they differ.

dbt-core

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications. (by dbt-labs)
                    versatile-data-kit    dbt-core
Mentions            52                    86
Stars               406                   8,718
Growth              3.4%                  6.1%
Activity            9.8                   9.7
Latest commit       1 day ago             4 days ago
Language            Python                Python
License             Apache License 2.0    Apache License 2.0
Mentions - the total number of mentions we've tracked, plus the number of user-suggested alternatives.
Stars - the number of stars a project has on GitHub. Growth - month-over-month growth in stars.
Activity - a relative number indicating how actively a project is being developed; recent commits have a higher weight than older ones.
For example, an activity of 9.0 indicates that a project is among the top 10% of the most actively developed projects we track.
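
The site doesn't publish its exact activity formula, so the Python sketch below only illustrates the stated idea - recent commits weigh more than older ones - using an assumed exponential decay with a made-up 30-day half-life.

```python
import math
import time

# Illustrative recency-weighted activity score. The half-life and scaling
# are assumptions for demonstration, not the comparison site's formula.
HALF_LIFE_DAYS = 30  # assumed: a commit's weight halves every 30 days


def activity_score(commit_timestamps, now=None):
    """Sum exponentially decayed weights over commit times (unix seconds)."""
    now = now if now is not None else time.time()
    decay = math.log(2) / (HALF_LIFE_DAYS * 86400)
    return sum(math.exp(-decay * (now - ts)) for ts in commit_timestamps)


# Example: three commits at 1, 10, and 100 days ago.
day = 86400
now = time.time()
print(round(activity_score([now - d * day for d in (1, 10, 100)]), 2))
```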

versatile-data-kit

Posts with mentions or reviews of versatile-data-kit. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-11-23.
  • Can we take a moment to appreciate how much of dataengineering is open source?
    8 projects | /r/dataengineering | 23 Nov 2022
    Free, Python+SQL ELT pipelines framework with orchestration functionality https://github.com/vmware/versatile-data-kit
    8 projects | /r/dataengineering | 23 Nov 2022
    If you wish to contribute, projects usually have good first issues: https://github.com/vmware/versatile-data-kit/labels/good%20first%20issue If you wish to learn, check out examples: https://github.com/vmware/versatile-data-kit/tree/main/examples
  • DE Open Source
    2 projects | /r/dataengineering | 13 Nov 2022
    Versatile Data Kit is a framework to build, run, and manage your data pipelines with Python or SQL on any cloud: https://github.com/vmware/versatile-data-kit Here's a list of good first issues: https://github.com/vmware/versatile-data-kit/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22 Join our Slack channel to connect with our team: https://cloud-native.slack.com/archives/C033PSLKCPR
  • What is a personality type of a Data Engineer?
    2 projects | /r/dataengineering | 26 Oct 2022
    Okay, I will explain what I am doing and how I see the "fun" in the project. I work with an open-source framework for data engineers. The community members are developers and people who use the tool - DEs. I do facilitate a monthly community meeting for everyone to meet and discuss important topics, but that's the only part that takes their direct time, and it's totally voluntary, so DEs usually don't join; I'm glad that the developers join and participate. What I had in mind is more of a design and promotion question. I have a vision for open-source projects to have a feel of friendliness and openness (fun), which I communicate through the design and visuals that are part of the repo and the information we share about the project. And since I don't find long texts engaging - I literally can't focus when I see a long description of, say, a GitHub repo - I have an internal struggle against very detailed descriptions. That said, I have an internal wish to transform the project into something more like this: https://github.com/mage-ai/mage-ai instead of this: https://github.com/vmware/versatile-data-kit But I'm questioning myself, and thinking that maybe it is better suited for DEs as it is.
  • Best Open source no-code ELT tool for startup
    5 projects | /r/dataengineering | 29 Aug 2022
    Open source, good for those with basic SQL and/or Python skills, extensible, and the team provides support in setting up and adopting the framework: https://github.com/vmware/versatile-data-kit I'm the community manager for this project, and I built my first full ELT pipeline (tracking GitHub stats) entirely by myself, in my first month, with no previous experience. It covers the full data journey. Oh, and it has Airflow integration, so you get a dashboard to see your jobs and dependencies, but with better/more intuitive scheduling.
  • I created a pipeline extracting Reddit data using Airflow, Docker, Terraform, S3, dbt, Redshift, and Google Data Studio
    7 projects | /r/dataengineering | 25 Jun 2022
    To simplify steps 1-5, let me bring another framework to your attention - Versatile Data Kit (entirely open source) - which allows you to create data jobs (be it ingestion, transformation, or publishing) with SQL/Python, runs on any cloud, and is also multi-tenant.
  • ELT of my own Strava data using the Strava API, MySQL, Python, S3, Redshift, and Airflow
    2 projects | /r/dataengineering | 24 Jun 2022
    I believe you would not need to build the Docker image yourself. There are data engineering frameworks that let you build your data jobs and take care of containerising your pipeline. You can have a look at this ingest-from-REST-API example. They also let you schedule your data job using cron, while the data job itself can contain SQL and Python.
  • How-to-Guide: Contributing to Open Source
    19 projects | /r/dataengineering | 11 Jun 2022
  • Has anyone "inherited" a pipeline/code/model that was so poorly written they wanted to quit their job?
    2 projects | /r/datascience | 3 May 2022
    I wouldn't stay there if they absolutely disagree with changing things; it would drain my energy and I'd just get sad and depressed. On the other hand, if you decide to go for it and try to untangle this mess, I think it would build your confidence, though it will take some real patience and persistence. I'm a real automation geek - everything that can be automated should be. If you'd like a suggestion, I would check out this open-source DataOps/automation tool: https://github.com/vmware/versatile-data-kit Maybe it helps, maybe not; whatever you do, good luck!
  • Python or Tool for Pipelines
    2 projects | /r/dataengineering | 9 Dec 2021
    I would recommend taking a look at Versatile Data Kit. It is an open-source tool that covers the full end-to-end cycle of data engineering with DataOps practices embedded - from ingesting data from a source system, through transformations (including implementations of some design patterns, like Kimball), to publishing data (for reports and apps). A minimal sketch of such a data job follows this list.
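
Several of the posts above describe VDK data jobs that mix Python ingestion steps with SQL transformations. The sketch below shows what one Python step of such a job might look like, based on VDK's documented `IJobInput` API; the REST endpoint and table names are made-up placeholders, not part of any real project.

```python
# 10_ingest_step.py - one Python step of a hypothetical VDK data job.
# The endpoint URL and table names below are illustrative placeholders.
import requests

from vdk.api.job_input import IJobInput


def run(job_input: IJobInput):
    # Pull a few records from a (made-up) REST API.
    response = requests.get("https://api.example.com/stats", timeout=30)
    response.raise_for_status()

    # Hand each record to VDK's ingestion pipeline; VDK takes care of
    # batching and delivery to the configured target.
    for record in response.json():
        job_input.send_object_for_ingestion(
            payload=record, destination_table="github_stats"
        )

    # Steps can also run SQL directly against the configured database.
    job_input.execute_query(
        "CREATE TABLE IF NOT EXISTS github_stats_daily AS "
        "SELECT * FROM github_stats"
    )
```

The cron scheduling mentioned in the Strava post is, per the VDK docs, configured in the job's config.ini rather than in code, so the step itself stays a plain Python function.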

dbt-core

Posts with mentions or reviews of dbt-core. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-09-16.
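
For contrast with the VDK sketch above: dbt-core's unit of work is a SQL model, and dbt-core 1.5+ also exposes a documented Python entry point for driving it programmatically. A minimal sketch, assuming an existing dbt project and profile in the working directory ("my_model" is a placeholder model name):

```python
# Minimal sketch of invoking dbt-core programmatically (dbt-core >= 1.5).
# Assumes a dbt project and profile are already configured in the cwd;
# "my_model" is a placeholder, not a real model.
from dbt.cli.main import dbtRunner, dbtRunnerResult

dbt = dbtRunner()

# Equivalent to `dbt run --select my_model` on the command line.
result: dbtRunnerResult = dbt.invoke(["run", "--select", "my_model"])

if result.success:
    for node_result in result.result:  # per-model results for `run`
        print(node_result.node.name, node_result.status)
```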

What are some alternatives?

When comparing versatile-data-kit and dbt-core you can also consider the following projects:

airbyte - The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

metricflow - MetricFlow allows you to define, build, and maintain metrics in code.

n8n - Free and source-available fair-code licensed workflow automation tool. Easily automate tasks across different services.

Airflow - Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

citus - Distributed PostgreSQL as an extension

dagster - An orchestration platform for the development, production, and observation of data assets.

argo-navis - Argo Navis repository for research, docs and misc items

streamlit - Streamlit — A faster way to build and share data apps.

targets - Function-oriented Make-like declarative workflows for R

nodejs-bigquery - Node.js client for Google Cloud BigQuery: A fast, economical and fully-managed enterprise data warehouse for large-scale data analytics.

great_expectations - Always know what to expect from your data.

nbdev - Create delightful software with Jupyter Notebooks