dbt-utils VS superset

Compare dbt-utils and superset to see how they differ.

              dbt-utils             superset
Mentions      7                     138
Stars         1,229                 59,724
Growth        2.5%                  1.5%
Activity      6.3                   10.0
Last commit   6 days ago            3 days ago
Language      Python                TypeScript
License       Apache License 2.0    Apache License 2.0
The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

dbt-utils

Posts with mentions or reviews of dbt-utils. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-03-08.
  • Show HN: Nasty, a cross warehouse, type checked, unit testable analytics library
    2 projects | news.ycombinator.com | 8 Mar 2024
    // To get around this, we can use the approach outlined by how dbt does ansi sql generate_series

      // https://github.com/dbt-labs/dbt-utils/blob/main/macros/sql/generate_series.sql
  • Anything one should know before going for self-hosted dbt?
    1 project | /r/dataengineering | 10 Jul 2023
    I got bit by dbt-utils/deduplicate naively removing any row that contained a null in it recently, but fortunately there was a workaround for Databricks and a few other flavors of SQL.
  • Managing SQL Tests
    2 projects | /r/dataengineering | 30 Mar 2023
    I'm used to utilising dbt and defining my tests there (along with dbt-utils or https://github.com/calogica/dbt-expectations): I simply add a list item to a column definition and can already define a great number of tests without having to copy code. I can even extend the pre-defined using generic tests. Writing custom tests also integrates nicely. Additionally it's very convenient to tag tests or define a severity. The learning curve for a business engineer is almost flat as long as they know some SQL.
  • Dbt to acquire Transform to build out its semantic layer
    2 projects | news.ycombinator.com | 9 Feb 2023
    My top three:

    - Dev/stag/prod env check numbers before pushing to production.

    - Unions between two sources that are not the same shape can be done without the headache. https://github.com/dbt-labs/dbt-utils#union_relations-source

    - Macros for common case when statements.

  • Analytics Stacks for Startups
    8 projects | dev.to | 21 Feb 2022
    Add tests: unit tests in SQL are still not really practical, but testing the data, before allowing users to see it, is possible. dbt has some basic tests like Non-NULL and so on. dbt_utils supports comparing data across tables. If you need more, there is Great Expectations and similar tools. dbt also supports writing SQL queries which output “bad” rows. Use this to, e.g., check a specific order against manually checked correct data. Tests give you confidence that your pipelines produce correct results: nothing is worse than waking up with a Slack message from your boss that the graphs look wrong… They are especially useful in case you have to refactor a data pipeline. Basically every query you would run during the QA phase of a change request has a high potential to become an automatic test.
  • Why is Data Build Tool (DBT) is so popular? What are some other alternatives?
    4 projects | /r/dataengineering | 4 Dec 2021
  • Unit testing SQL in DBT
    3 projects | /r/dataengineering | 6 Feb 2021
    The equality test macro is also in the dbt-utils package from fishtown at https://github.com/fishtown-analytics/dbt-utils/blob/master/macros/schema_tests/equality.sql
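
Several of the posts above mention comparing data across tables with the dbt-utils equality test. The following is a minimal sketch of that general idea (a two-way set difference between two relations), written against SQLite so it runs without a warehouse. It is not the macro's actual code; the function names `equality_mismatches` and `build_demo` and the `expected`/`actual` tables are hypothetical, chosen only for illustration.

```python
import sqlite3


def equality_mismatches(conn, table_a, table_b):
    """Return rows that appear in exactly one of the two tables.

    Table names are interpolated directly into the SQL, so pass trusted
    identifiers only. This is an illustrative helper, not a dbt-utils API.
    """
    query = f"""
        select 'only_in_a' as side, * from (
            select * from {table_a} except select * from {table_b}
        )
        union all
        select 'only_in_b' as side, * from (
            select * from {table_b} except select * from {table_a}
        )
    """
    return conn.execute(query).fetchall()


def build_demo():
    """Create two small in-memory tables that differ in one row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table expected (id integer, amount integer)")
    conn.execute("create table actual (id integer, amount integer)")
    conn.executemany("insert into expected values (?, ?)", [(1, 10), (2, 20)])
    conn.executemany("insert into actual values (?, ?)", [(1, 10), (2, 25)])
    return conn


if __name__ == "__main__":
    conn = build_demo()
    # An empty result would mean the two tables hold the same set of rows.
    for row in equality_mismatches(conn, "expected", "actual"):
        print(row)
```

Because `except` has set semantics, this sketch ignores differences in duplicate row counts; a real test would likely need to account for that.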

superset

Posts with mentions or reviews of superset. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-02-26.
  • Apache Superset
    14 projects | news.ycombinator.com | 26 Feb 2024
    Superset is absolutely phenomenal. I really hope Microsoft eventually releases all of their customizations they made to it internally to the OS community someday.

    https://www.youtube.com/watch?v=RY0SSvSUkMA

    https://github.com/apache/superset/discussions/20094

  • A modern data stack for startups
    2 projects | news.ycombinator.com | 30 Dec 2023
    I recently ran a little shootout between Superset, Metabase, and Lightdash. All have nontrivial weaknesses but I ended up picking Lightdash.

    Superset is the best of them at _data visualization_ but I honestly found it almost useless for self-serve _BI_ by business users. This issue on how to do joins in Superset (with stalebot making a mess XD) is everything difficult about Superset for BI in a nutshell. https://github.com/apache/superset/issues/8645

    Metabase is pretty great and it's definitely the right choice for a startup looking to get low cost BI set up. It still has a very table centric view, but feels built for _BI_ rather than visualization alone.

    Lightdash has significant warts (YAML, pivoting being done in the frontend, no symmetric aggregates) but the Looker inspiration is obvious and it makes it easy to present _groups of tables_ to business users ready to rock. I liked Looker before Google acquired it. My business users are comfortable with star and snowflake schemas (not that they know those words) and it was easy to drop Lightdash on top of our existing data warehouse.

  • FLaNK Stack Weekly for 20 Nov 2023
    37 projects | dev.to | 20 Nov 2023
  • Hiding tokens retrieved via API from the html source?
    1 project | /r/dotnet | 4 Nov 2023
  • Yandex open sourced it's BI tool DataLens
    4 projects | news.ycombinator.com | 26 Sep 2023
    Or like not being able to delete a user without running some SQL:

    https://github.com/apache/superset/issues/13345

    Almost instantly ran into this issue when setting up a test instance of Superset. And the issue has been around for years.

  • Apache Superset Is a Data Visualization and Data Exploration Platform
    1 project | news.ycombinator.com | 11 Sep 2023
  • Apache Superset: Installing locally is easy using the makefile
    3 projects | dev.to | 20 Aug 2023
    Are you interested in trying out Superset, but you're intimidated by the local setup process? Worry not! Superset needs some initial setup to install locally, but I've got a streamlined way to get started - using the makefile! This file contains a set of scripts to simplify the setup process.
  • More public SQL-queryable databases?
    3 projects | /r/datasets | 10 Jul 2023
    Recently I discovered BigQuery public datasets - just over 200 datasets available for directly querying via SQL. I think this is a great thing! I can connect these direct to an analytics platform (we use Apache Superset which uses Python SQLAlchemy under the hood) for example and just start dashboarding.
  • How useful is SQL for managers?
    1 project | /r/learnprogramming | 24 Jun 2023
    if they don't want to pay for powerbi, can try something like https://superset.apache.org/
  • Real-time data analytics with Apache Superset, Redpanda, and RisingWave
    3 projects | dev.to | 20 May 2023
    In today's fast-paced data-driven world, organizations must analyze data in real-time to make timely and informed decisions. Real-time data analytics enables businesses to gain valuable insights, respond to real-time events, and stay ahead of the competition. Also, the analytics engine must be capable of running analytical queries and returning results in real-time. In this article, we will explore how you can build a real-time data analytics solution using open-source tools: Redpanda, a distributed streaming platform; Apache Superset, a data visualization and business intelligence platform; and RisingWave, a streaming database.

What are some alternatives?

When comparing dbt-utils and superset you can also consider the following projects:

dbt-expectations - Port(ish) of Great Expectations to dbt test macros

streamlit - Streamlit — A faster way to build and share data apps.

sqlfluff - A modular SQL linter and auto-formatter with support for multiple dialects and templated code.

jupyter-dash - OBSOLETE - Dash v2.11+ has Jupyter support built in!

dbt-oracle - A dbt adapter for oracle db backend

Apache Hive - Apache Hive

nodejs-bigquery - Node.js client for Google Cloud BigQuery: A fast, economical and fully-managed enterprise data warehouse for large-scale data analytics.

lightdash - Self-serve BI to 10x your data team ⚡️

Metabase - The simplest, fastest way to get business intelligence and analytics to everyone in your company

django-project-template - The Django project template I use, for installation with django-admin.

react-admin - A frontend Framework for building data-driven applications running on top of REST/GraphQL APIs, using TypeScript, React and Material Design