metricflow VS dbt_metrics

Compare metricflow vs dbt_metrics and see what their differences are.

metricflow

MetricFlow allows you to define, build, and maintain metrics in code. (by dbt-labs)

dbt_metrics

Macros for calculating metrics (by dbt-labs)
                metricflow                                 dbt_metrics
Mentions        4                                          1
Stars           1,073                                      208
Growth          2.5%                                       1.4%
Activity        9.8                                        5.1
Last commit     4 days ago                                 6 months ago
Language        Python                                     Python
License         GNU General Public License v3.0 or later   Apache License 2.0
The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

metricflow

Posts with mentions or reviews of metricflow. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-04-06.
  • MetricFlow allows you to define, build, and maintain metrics in code.
    1 project | /r/CKsTechNews | 6 Apr 2022
  • Show HN: MetricFlow – open-source metric framework
    4 projects | news.ycombinator.com | 6 Apr 2022
    Three things:

    First, MetricFlow does not currently support MySQL. We launched with support for BigQuery, Redshift, and Snowflake. I have opened an issue to add support for MySQL (and similar issues for other SQL engines are coming): https://github.com/transform-data/metricflow/issues/27

    Second, what we call a data source is more similar to a table in a database than to the underlying database service itself. MetricFlow is useful when you're using a single SQL engine (indeed, that's all we support today), but it is most useful when you're in a world where joins are a thing. That said, if you have one big data table you might still find it useful to have declarative metric definitions defined in MetricFlow. Suppose, for example, you had a big NoSQL-style table filled with JSON objects. You might define a few data sources that normalize those JSON objects into top-level elements (identifiers, dimensions, aggregated measures) using the sql_query data source config attribute, and that would allow you to support structured queries on the data consumption end while pushing unstructured blobs from your application layer. This will be slow at query time, and only as reliable as the level of discipline exerted in your application development workflow, but it's possible.
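The normalization idea in the second point can be illustrated with a minimal Python sketch. This is not MetricFlow code and does not use the sql_query attribute; the row contents, field names, and function names are all hypothetical. It only shows what promoting JSON fields to an identifier, a dimension, and a measure might look like before aggregating.

```python
import json
from collections import defaultdict

# Hypothetical raw rows: each one is an unstructured JSON blob
# written by the application layer.
RAW_ROWS = [
    '{"user_id": 1, "country": "US", "amount": 20.0}',
    '{"user_id": 2, "country": "DE", "amount": 5.5}',
    '{"user_id": 1, "country": "US", "amount": 4.5}',
]


def normalize(raw: str) -> dict:
    """Promote JSON fields to top-level elements, as a normalizing query would."""
    blob = json.loads(raw)
    return {
        "identifier": blob["user_id"],  # key that could be used for joins
        "dimension": blob["country"],   # attribute to group by
        "measure": blob["amount"],      # value to aggregate
    }


def sum_by_dimension(rows):
    """Aggregate the measure per dimension value after normalizing each row."""
    totals = defaultdict(float)
    for row in map(normalize, rows):
        totals[row["dimension"]] += row["measure"]
    return dict(totals)


result = sum_by_dimension(RAW_ROWS)
```

Every row is parsed at query time here, which mirrors the caveat above: the approach works, but it is slow and depends on the application writing consistent blobs.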

    Third, if we did support MySQL you'd basically connect to it via standard connection parameters - we have a config file where you can store the required information and then we'll manage the connections for you. However, I'm not familiar with uxwizz, and a quick perusal of their documentation did not turn up how one goes about connecting to the underlying DB. It's likely I just missed this, but at any rate I don't know how it is done. If they don't support standard MySQL client connections you'd need to write an adapter of some kind against whatever DB connection APIs they provide, in which case you'd likely need to roll a custom implementation of MetricFlow's SqlClient interface and initialize the MetricFlowEngine with that.

dbt_metrics

Posts with mentions or reviews of dbt_metrics. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-04-06.
  • Show HN: MetricFlow – open-source metric framework
    4 projects | news.ycombinator.com | 6 Apr 2022
    If you’re interested, the longer version:

    Semantics

    MetricFlow requires less configuration than these other frameworks. We accomplish this by choosing abstractions that allow us to handle more on our side at query time through the DataFlow Plan builder. Working with the SQL constructions as a dataflow enables extensions such as non-data-warehouse data sources, or using other languages (Python) for some transformations.

    The dbt spec is relatively new and requires a few extremely unDRY expressions. The most obvious is the lack of support for joins, which means you simply won’t be able to answer most questions unless you build huge tables. There are a few other issues with the abstractions. For example, dimensions are defined multiple times across metrics. A few folks posted more about these challenges in their GitHub issue, but they’re sticking to their spec. I’m skeptical it will work at any scale.

    The Cube concept is similar to Explores in Looker. They’re limiting because you end up with a bunch of representations of small domains within the warehouse, and the moment you hit the edge of that domain you need to add a new Cube/Explore. This is not DRY and it’s frustrating. There is also no first-class object for Metrics, which means you’re limited to relatively simple metric types.

    Performance

    MetricFlow has the flexibility of the DataFlow Plan Builder and builds quite efficient queries. The Materialization feature allows you to build roll up tables programmatically to the data warehouse which could then be used as a low-latency serving layer.
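A roll-up table of the kind described above can be sketched with Python's built-in sqlite3 module. This is only an illustration of the general pre-aggregation idea, not MetricFlow's Materialization feature; the table names, columns, and data are hypothetical.

```python
import sqlite3

# In-memory database standing in for a data warehouse (assumption for the sketch).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (order_date TEXT, country TEXT, amount REAL);
    INSERT INTO orders VALUES
        ('2022-04-01', 'US', 20.0),
        ('2022-04-01', 'US', 4.5),
        ('2022-04-01', 'DE', 5.5),
        ('2022-04-02', 'US', 10.0);

    -- Roll-up table: the metric is pre-aggregated once per (day, country),
    -- so later reads do not need to scan the raw orders table.
    CREATE TABLE revenue_by_day_country AS
        SELECT order_date, country, SUM(amount) AS revenue
        FROM orders
        GROUP BY order_date, country;
    """
)

# Serving-layer style read against the small roll-up table.
rollup = conn.execute(
    "SELECT order_date, country, revenue "
    "FROM revenue_by_day_country "
    "ORDER BY order_date, country"
).fetchall()
```

The trade-off is the usual one for pre-aggregation: reads are cheap, but the roll-up has to be rebuilt or refreshed when the underlying data changes.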

    dbt's metrics implementation is a Jinja macro that generates a static query per metric requested (https://github.com/dbt-labs/dbt_metrics/blob/main/macros/get...). This macro will be quite hard to optimize for more complicated metric types. We struggled a ton with this before refactoring our framework to allow the manipulation and optimization of these DataFlow Plans.
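To make "a static query per metric" concrete, here is a small Python sketch of template-based query generation. It is not the dbt_metrics macro and does not reproduce its output; the template, metric definitions, and function name are hypothetical. The point is that each metric is rendered into its own fixed SQL string, leaving no intermediate representation to optimize across metrics.

```python
from string import Template

# Hypothetical query template: one fixed SQL shape, filled in per metric.
QUERY_TEMPLATE = Template(
    "SELECT $dimension, $aggregation AS $name FROM $table GROUP BY $dimension"
)

# Hypothetical metric definitions (names, tables, and expressions are made up).
METRICS = {
    "revenue": {"table": "orders", "aggregation": "SUM(amount)"},
    "order_count": {"table": "orders", "aggregation": "COUNT(*)"},
}


def render_metric_query(name: str, dimension: str) -> str:
    """Render one static SQL string for a single metric and dimension."""
    spec = METRICS[name]
    return QUERY_TEMPLATE.substitute(
        name=name,
        dimension=dimension,
        table=spec["table"],
        aggregation=spec["aggregation"],
    )


# Two metrics over the same table still yield two independent queries.
queries = [render_metric_query(metric, "country") for metric in METRICS]
```

Both rendered queries scan the same table separately; combining them into a single pass would need a planning step that a plain template does not provide.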

    Cube is pretty slick on caching, but I know less about their query optimizations. They have some awesome pre-aggregation and caching features. I think this comes from their background in serving frontend interfaces.

    Interfaces

    Today, MetricFlow supports a Python SDK and our CLI. Transform has a few more interfaces (SQL over JDBC, GraphQL, React) that sit outside the scope of this OSS project.

    dbt only builds a query in the dbt context today. TBD what the dbt server does, but I imagine it will expose a JDBC interface for paying customers.

    Cube seems more focused on building custom data applications but has recently pivoted to the analytics front. I haven’t seen those interfaces in action but I’m curious to learn more there.

What are some alternatives?

When comparing metricflow and dbt_metrics you can also consider the following projects:

dbt-core - dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.

Apache Superset - Apache Superset is a Data Visualization and Data Exploration Platform [Moved to: https://github.com/apache/superset]

dictum - Describe business metrics with YAML, query and visualize in Jupyter with zero SQL

dbt - dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications. [Moved to: https://github.com/dbt-labs/dbt-core]

datafluent_pg - Build a better understanding of your data in PostgreSQL.