Python Business Intelligence

Open-source Python projects categorized as Business Intelligence

Top 8 Python Business Intelligence Projects

  • Redash

    Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.

    Project mention: Recommend Django Great Projects | news.ycombinator.com | 2022-12-03
  • dbt-core

    dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.

    Project mention: Unit testing with dbt | reddit.com/r/dataengineering | 2023-01-11

    Hey OP! There are packages like dbt-datamocktool or dbt-unit-testing. You can check them out. You might want to check out this thread as well.

  • metricflow

    MetricFlow allows you to define, build, and maintain metrics in code.

    Project mention: Show HN: MetricFlow – open-source metric framework | news.ycombinator.com | 2022-04-06

    Three things:

    First, MetricFlow does not currently support MySQL. We launched with support for BigQuery, Redshift, and Snowflake. I have opened an issue to add support for MySQL (and similar issues for other SQL engines are coming): https://github.com/transform-data/metricflow/issues/27

    Second, what we call a data source is more similar to a table in a database than to the underlying database service itself. MetricFlow is useful when you're using a single SQL engine (indeed, that's all we support today), but it is most useful when you're in a world where joins are a thing. That said, if you have one big data table you might still find it useful to have declarative metric definitions in MetricFlow. Suppose, for example, you had a big NoSQL-style table filled with JSON objects. You might define a few data sources that normalize those JSON objects into top-level elements (identifiers, dimensions, aggregated measures) using the sql_query data source config attribute, and that would allow you to support structured queries on the data consumption end while pushing unstructured blobs from your application layer. This will be slow at query time, and only as reliable as the level of discipline exerted in your application development workflow, but it's possible.

    Third, if we did support MySQL you'd basically connect to it via standard connection parameters - we have a config file where you can store the required information and then we'll manage the connections for you. However, I'm not familiar with uxwizz, and a quick perusal of their documentation did not turn up how one goes about connecting to the underlying DB. It's likely I just missed this, but at any rate I don't know how it is done. If they don't support standard MySQL client connections you'd need to write an adapter of some kind against whatever DB connection APIs they provide, in which case you'd likely need to roll a custom implementation of MetricFlow's SqlClient interface and initialize the MetricFlowEngine with that.
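The normalization idea in the second point above can be illustrated in plain Python. This is only a sketch of the concept, not MetricFlow's configuration format or API: the JSON layout, the field names, and the helpers `normalize` and `revenue_by` are all hypothetical.

```python
import json
from collections import defaultdict

# Hypothetical raw rows: each one is an unstructured JSON blob
# pushed by the application layer.
RAW_EVENTS = [
    '{"user": {"id": "u1", "country": "DE"}, "order": {"amount": 20.0}}',
    '{"user": {"id": "u2", "country": "US"}, "order": {"amount": 35.5}}',
    '{"user": {"id": "u1", "country": "DE"}, "order": {"amount": 4.5}}',
]


def normalize(raw):
    """Flatten one JSON blob into top-level columns."""
    doc = json.loads(raw)
    return {
        "user_id": doc["user"]["id"],            # identifier
        "country": doc["user"]["country"],       # dimension
        "order_amount": doc["order"]["amount"],  # measure input
    }


def revenue_by(dimension, rows):
    """Sum the order_amount measure grouped by one dimension."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += row["order_amount"]
    return dict(totals)


rows = [normalize(raw) for raw in RAW_EVENTS]
revenue_by_country = revenue_by("country", rows)
print(revenue_by_country)
```

Once the blobs are flattened into identifiers, dimensions, and measures, a structured query such as "revenue by country" becomes a simple group-and-sum. In the setup described above, that flattening would live in SQL rather than in application code.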

  • retentioneering-tools

    Retentioneering: product analytics, data-driven customer journey map optimization, marketing analytics, web analytics, transaction analytics, graph visualization, and behavioral segmentation with customer segments in Python. Open-source analytics, predictive analytics over clickstream, sentiment analysis, AB tests, machine learning, and Monte Carlo Markov Chain simulations, extending pandas, NetworkX and sklearn.

    Project mention: My Favorite Off-the-Shelf Data Science Repos, What Are Yours? | news.ycombinator.com | 2022-06-22

    Here are my top off-the-shelf data science models for marketing. I'd be interested to hear which other marketing data science tools you use.

    Product Recommendation on Your Website with Metarank (https://github.com/metarank/metarank)

    Metarank is a tool that helps you easily build an advanced recommendation engine for your products or content on your website. To get started you only need historical performance data of your products (e.g. number of clicks) and additional metadata like product rating, genre, ingredients or price. In a YAML file, you define the features and the model parameters (e.g. number of iterations, modeling technique). The API service integrates with Apache Flink and can be easily integrated into Kubernetes clusters.

    User Journey Analysis on your Website with Retentioneering (https://github.com/retentioneering/retentioneering-tools)

    Retentioneering helps you understand the user journey on your website. It is a Python library that lets you easily connect your Google Analytics data (in BigQuery). You define the user ID, event type, and timestamp. From this input, it builds a comprehensive graph network with gains and losses, as you know them from a customer journey. In addition, it creates customer segments that share a similar customer journey. This reduces the complexity of a purely descriptive view of the data.
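To make the "user ID, event type, timestamp to graph" step concrete, here is a minimal standard-library sketch that counts transitions between consecutive events per user. It does not use Retentioneering's API; the sample events and the helper `transition_counts` are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical clickstream rows: (user_id, event_type, timestamp).
EVENTS = [
    ("u1", "home", 1), ("u1", "catalog", 2), ("u1", "cart", 3),
    ("u2", "home", 1), ("u2", "catalog", 2), ("u2", "exit", 3),
    ("u3", "home", 5), ("u3", "exit", 6),
]


def transition_counts(events):
    """Count directed edges (source event, target event) across all users."""
    by_user = defaultdict(list)
    for user_id, event_type, timestamp in events:
        by_user[user_id].append((timestamp, event_type))

    edges = Counter()
    for steps in by_user.values():
        steps.sort()  # order each user's events by timestamp
        for (_, source), (_, target) in zip(steps, steps[1:]):
            edges[(source, target)] += 1
    return edges


edges = transition_counts(EVENTS)
for (source, target), count in sorted(edges.items()):
    print(f"{source} -> {target}: {count}")
```

The resulting edge counts are the raw material for a journey graph: nodes are event types, and edge weights show where users move on and where they drop off.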

    Marketing Mix Modeling with Robyn (https://github.com/facebookexperimental/Robyn)

    Fewer third-party cookies mean fewer attribution models. The answer to this is Marketing Mix Modeling. Marketing mix models are regression models that use statistical probability to calculate the effect size of marketing channels and other independent variables. The advantage is that business context can be modeled much more realistically. For example, Google searches for your own brand can be integrated to determine the share of revenue attributable to brand strength. Likewise, offline advertising measures can be modeled with other metrics in this context (e.g. offline advertising with GRPs). Robyn takes into account adstock effects, ROAS calculation and multicollinearity in the marketing channels. In addition, with simple functionality, budgets can be optimized using the predictions, and results from marketing tests can be integrated into the model for calibration.
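The adstock effect mentioned above can be sketched with a geometric decay, one common textbook form of the transformation. This is an illustration of the idea only, not Robyn's implementation (Robyn itself is an R package); the function name and the sample spend series are hypothetical.

```python
def geometric_adstock(spend, decay):
    """Carry over a share of past advertising effect into later periods.

    adstock[t] = spend[t] + decay * adstock[t - 1]
    """
    carried = 0.0
    result = []
    for amount in spend:
        carried = amount + decay * carried
        result.append(carried)
    return result


# Hypothetical weekly spend for one channel, with half of the effect
# carried into each following week.
adstocked = geometric_adstock([100.0, 0.0, 0.0, 50.0], decay=0.5)
print(adstocked)  # [100.0, 50.0, 25.0, 62.5]
```

In a marketing mix model, the adstocked series rather than the raw spend would typically enter the regression, so that a campaign's effect can outlast the week in which the money was spent.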

  • dbt-metabase

    Model synchronization from dbt to Metabase

    Project mention: A modern data stack for startups | dev.to | 2022-04-21

    So how do we get this into Metabase? There's a tool called dbt-metabase that can infer Metabase semantic type information from the dbt schema and push it into Metabase. We run this whenever we complete a dbt build, helping sync Metabase with whatever new fields we may have added.

  • prosto

    Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby

    Project mention: Show HN: PRQL 0.2 – Releasing a better SQL | news.ycombinator.com | 2022-06-27

    > Joins are what makes relational modeling interesting!

    It is the central part of RM which is difficult to model using other methods and which requires high expertise in non-trivial use cases. One alternative to how multiple tables can be analyzed without joins is proposed in the concept-oriented model [1] which relies on two equal modeling constructs: sets (like RM) and functions. In particular, it is implemented in the Prosto data processing toolkit [2] and its Column-SQL language. The idea is that links between tables are used instead of joins. A link is formally a function from one set to another set.

    [1] Joins vs. Links or Relational Join Considered Harmful https://www.researchgate.net/publication/301764816_Joins_vs_...

    [2] Prosto data processing toolkit https://github.com/asavinov/prosto
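The "link as a function from one set to another" idea can be shown in plain Python. This sketch does not use Prosto's API or Column-SQL; the tables, field names, and the functions `product_of` and `amount` are hypothetical.

```python
# Two hypothetical sets (tables): orders and products.
ORDERS = [
    {"order_id": 1, "product_id": "p1", "quantity": 2},
    {"order_id": 2, "product_id": "p2", "quantity": 5},
    {"order_id": 3, "product_id": "p1", "quantity": 1},
]
PRODUCTS = {
    "p1": {"name": "keyboard", "price": 10.0},
    "p2": {"name": "cable", "price": 4.0},
}


def product_of(order):
    """Link: a function from the set of orders to the set of products."""
    return PRODUCTS[order["product_id"]]


def amount(order):
    """Derived column defined by following the link, with no joined table."""
    return order["quantity"] * product_of(order)["price"]


amounts = [amount(order) for order in ORDERS]
print(amounts)  # [20.0, 20.0, 10.0]
```

Instead of materializing a joined orders-products table, each order reaches its product through the link, and the new column is a function composed on top of it.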

  • datafluent_pg

    Build a better understanding of your data in PostgreSQL.

  • dictum

    Describe business metrics with YAML, query and visualize in Jupyter with zero SQL

    Project mention: Looking for feedback on my open-source project | reddit.com/r/SQL | 2022-08-22

    TL;DR: Github repo | Documentation

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2023-01-11.

Index

What are some of the best open-source Business Intelligence projects in Python? This list will help you:

#  Project                Stars
1  Redash                 22,501
2  dbt-core                6,397
3  metricflow                734
4  retentioneering-tools     591
5  dbt-metabase              273
6  prosto                     65
7  datafluent_pg              28
8  dictum                     17