Open Source Analytics Stack: Bringing Control, Flexibility, and Data-Privacy to Your Analytics

This page summarizes the projects mentioned and recommended in the original post on dev.to

Our great sponsors
  • InfluxDB - Power Real-Time Data Analytics at Scale
  • WorkOS - The modern identity platform for B2B SaaS
  • SaaSHub - Software Alternatives and Reviews
  • Matomo

    Empowering People Ethically with the leading open source alternative to Google Analytics that gives you full control over your data. Matomo lets you easily collect data from websites & apps and visualise this data and extract insights. Privacy is built-in. Liberating Web Analytics. Star us on Github? +1. And we love Pull Requests!

  • Matomo (website, GitHub) is an open-source web analytics tool and calls itself a Google Analytics alternative. Matomo gives you valuable insights into your website's visitors, marketing campaigns, etc., making it easy to optimize your strategy and online experience of your visitors.

  • dbt

    Discontinued dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications. [Moved to: https://github.com/dbt-labs/dbt-core]

  • Due to the rise in cloud-based data warehouses, businesses can directly load all the raw data into the data warehouse without prior transformations. This process is known as ELT (Extract, Load, Transform) and gives data and analytics teams freedom to develop ad-hoc transformations based on their particular needs. ELT became popular as the cloud's processing power and scale became better suited to transforming data. DBT (website, GitHub) is a popular open-source tool recommended for ELT and allows businesses to transform data in their warehouses more effectively. It's a great pairing with with RudderStack's Cloud Extract ETL tool.

  • InfluxDB

    Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.

    InfluxDB logo
  • PostgreSQL

    Mirror of the official PostgreSQL GIT repository. Note that this is just a *mirror* - we don't work with pull requests on github. To contribute, please see https://wiki.postgresql.org/wiki/Submitting_a_Patch

  • Moreover, using open-source warehouse tools can allow unlocking additional insights from your data in real-time and at a lesser cost. PostgreSQL (website, repo) is a popular example of an efficient and low-cost data warehousing solution. Another example is ClickHouse (website, GitHub), an open-source, analytics-focused DBMS that allows generating analytical reports from data in real-time using SQL.

  • Apache Kafka

    Mirror of Apache Kafka

  • With the increase in real-time data streams and event streams, certain use cases emerged that require access to real-time data such as financial services risk reporting or detecting a credit card fraud. Real-time streams can be obtained using a stream processing framework like Apache Kafka (website, GitHub). The focus is to direct the stream of data from various sources into reliable queues where data can be automatically transformed, stored, analyzed and reported concurrently.

  • superset

    Apache Superset is a Data Visualization and Data Exploration Platform

  • Open-source BI platforms such as Metabase (website, GitHub) and Apache SuperSet (website, GitHub) are easy to deploy without IT involvement. Metabase lets you build dashboards from the data in your warehouse easily, with no SQL, or, if you have data engineering or science know-how, inside more powerful and flexible notebooks or with SQL itself. Similarly, Apache SuperSet helps businesses explore and visualize data from simple line charts to detailed geospatial charts.

  • unomi

    Apache Unomi

  • Talking about successful data ingestion tools, most businesses rely increasingly on different Customer Data Platforms (CDPs) that track, collect, and ingest data from multiple sources and systems into a single platform to get a unified customer view. Apache Unomi (website, GitHub) is a perfect example of an open-source CDP that ingests data and collects it in one place.

  • Snowplow

    The enterprise-grade behavioral data engine (web, mobile, server-side, webhooks), running cloud-natively on AWS and GCP

  • However, limitations to traditional CDPs, especially around connecting to best-of-breed customer tooling and exposing data for use across an organization have driven a new generation of non-CDPs. Solutions like Snowplow's (website, GitHub) data delivery platform and RudderStack's (website, GitHub) customer data platform for developers ingest data from a multitude of sources, apply in-stream transformations, and route data to your data warehouse, like Snowplow, or your warehouse plus your preferred customer tooling destinations for activation, like RudderStack.

  • WorkOS

    The modern identity platform for B2B SaaS. The APIs are flexible and easy-to-use, supporting authentication, user identity, and complex enterprise features like SSO and SCIM provisioning.

    WorkOS logo
  • Rudderstack

    Privacy and Security focused Segment-alternative, in Golang and React

  • However, limitations to traditional CDPs, especially around connecting to best-of-breed customer tooling and exposing data for use across an organization have driven a new generation of non-CDPs. Solutions like Snowplow's (website, GitHub) data delivery platform and RudderStack's (website, GitHub) customer data platform for developers ingest data from a multitude of sources, apply in-stream transformations, and route data to your data warehouse, like Snowplow, or your warehouse plus your preferred customer tooling destinations for activation, like RudderStack.

  • rudderstack-docs

    Documentation repository for RudderStack - the Customer Data Platform for Developers.

  • However, limitations to traditional CDPs, especially around connecting to best-of-breed customer tooling and exposing data for use across an organization have driven a new generation of non-CDPs. Solutions like Snowplow's (website, GitHub) data delivery platform and RudderStack's (website, GitHub) customer data platform for developers ingest data from a multitude of sources, apply in-stream transformations, and route data to your data warehouse, like Snowplow, or your warehouse plus your preferred customer tooling destinations for activation, like RudderStack.

  • ClickHouse

    ClickHouse® is a free analytics DBMS for big data

  • Moreover, using open-source warehouse tools can allow unlocking additional insights from your data in real-time and at a lesser cost. PostgreSQL (website, repo) is a popular example of an efficient and low-cost data warehousing solution. Another example is ClickHouse (website, GitHub), an open-source, analytics-focused DBMS that allows generating analytical reports from data in real-time using SQL.

  • PostHog

    🦔 PostHog provides open-source product analytics, session recording, feature flagging and A/B testing that you can self-host.

  • The self-hosted PostHog (website, GitHub) is an excellent open-source alternative for product analytics and can be easily integrated into your infrastructure. You can easily analyze how customers interact with your product, the user traffic, and ways to improve your user retention.

  • Metabase

    The simplest, fastest way to get business intelligence and analytics to everyone in your company :yum:

  • Open-source BI platforms such as Metabase (website, GitHub) and Apache SuperSet (website, GitHub) are easy to deploy without IT involvement. Metabase lets you build dashboards from the data in your warehouse easily, with no SQL, or, if you have data engineering or science know-how, inside more powerful and flexible notebooks or with SQL itself. Similarly, Apache SuperSet helps businesses explore and visualize data from simple line charts to detailed geospatial charts.

  • Countly

    Countly is a product analytics platform that helps teams track, analyze and act-on their user actions and behaviour on mobile, web and desktop applications.

  • Countly (website, GitHub) is also an open-source product analytics platform that is designed primarily for marketing organizations. It helps marketers track website information (website transactions, campaigns, and sources that led visitors to the website, etc.). Countly also collects real-time mobile analytics metrics like active users, time spent in-app, customer location, etc., in a unified view on your dashboard.

  • Apache Superset

    Discontinued Apache Superset is a Data Visualization and Data Exploration Platform [Moved to: https://github.com/apache/superset]

  • Open-source BI platforms such as Metabase (website, GitHub) and Apache SuperSet (website, GitHub) are easy to deploy without IT involvement. Metabase lets you build dashboards from the data in your warehouse easily, with no SQL, or, if you have data engineering or science know-how, inside more powerful and flexible notebooks or with SQL itself. Similarly, Apache SuperSet helps businesses explore and visualize data from simple line charts to detailed geospatial charts.

  • SaaSHub

    SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives

    SaaSHub logo
NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project

Related posts