Open Source Analytics Stack: Bringing Control, Flexibility, and Data-Privacy to Your Analytics

This page summarizes the projects mentioned and recommended in the original post on dev.to.

  1. Matomo

    Matomo is the leading open-source alternative to Google Analytics, giving you complete control and built-in privacy. Easily collect, visualise, and analyse data from websites & apps.

    Matomo (website, GitHub) is an open-source web analytics tool that positions itself as a Google Analytics alternative. It gives you insight into your website's visitors and marketing campaigns, making it easier to optimize your strategy and your visitors' online experience.

  3. dbt

    Discontinued. dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications. [Moved to: https://github.com/dbt-labs/dbt-core]

    With the rise of cloud-based data warehouses, businesses can load raw data directly into the warehouse without prior transformation. This process is known as ELT (Extract, Load, Transform), and it gives data and analytics teams the freedom to develop ad hoc transformations based on their particular needs. ELT became popular as the cloud's processing power and scale became better suited to transforming data. dbt (website, GitHub) is a popular open-source tool recommended for ELT that lets businesses transform data in their warehouses more effectively. It pairs well with RudderStack's Cloud Extract ETL tool.
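
    To make the ELT order concrete, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for a cloud warehouse. The table names (raw_events, daily_revenue), the sample rows, and the function names are hypothetical; in a dbt project the transform step would typically live in a SQL model file rather than in Python.

```python
import sqlite3

# Hypothetical raw records, loaded without any prior transformation.
RAW_EVENTS = [
    ("2024-01-01", "order", "19.99"),
    ("2024-01-01", "order", "5.01"),
    ("2024-01-02", "pageview", ""),
    ("2024-01-02", "order", "10.00"),
]


def load_raw(conn, rows):
    """Extract + Load: store rows exactly as they arrived (all text)."""
    conn.execute("CREATE TABLE raw_events (day TEXT, kind TEXT, amount TEXT)")
    conn.executemany("INSERT INTO raw_events VALUES (?, ?, ?)", rows)


def transform(conn):
    """Transform: build a cleaned, aggregated table inside the warehouse."""
    conn.execute(
        """
        CREATE TABLE daily_revenue AS
        SELECT day, ROUND(SUM(CAST(amount AS REAL)), 2) AS revenue
        FROM raw_events
        WHERE kind = 'order'
        GROUP BY day
        """
    )


def run_elt():
    """Run load, then transform, and return the aggregated rows."""
    conn = sqlite3.connect(":memory:")
    load_raw(conn, RAW_EVENTS)
    transform(conn)
    rows = conn.execute(
        "SELECT day, revenue FROM daily_revenue ORDER BY day"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(run_elt())
```

    The point of the ordering is that the raw table stays untouched, so a different transformation can be written later without re-ingesting the data.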

  4. PostgreSQL

    Mirror of the official PostgreSQL GIT repository. Note that this is just a *mirror* - we don't work with pull requests on github. To contribute, please see https://wiki.postgresql.org/wiki/Submitting_a_Patch

    Open-source warehouse tools can also unlock additional insights from your data in real time and at a lower cost. PostgreSQL (website, repo) is a popular example of an efficient, low-cost data warehousing solution. Another example is ClickHouse (website, GitHub), an open-source, analytics-focused DBMS that generates analytical reports from data in real time using SQL.

  5. Apache Kafka

    Mirror of Apache Kafka

    As real-time data streams and event streams have grown, use cases have emerged that require access to real-time data, such as financial services risk reporting or credit card fraud detection. Real-time streams can be handled with a stream processing platform like Apache Kafka (website, GitHub). The goal is to direct data from various sources into reliable queues where it can be automatically transformed, stored, analyzed, and reported on concurrently.
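
    The queue-and-consumer idea described above can be sketched without a broker. This is a minimal sketch that uses Python's standard queue.Queue as a stand-in for a Kafka topic; it does not use any Kafka client API, and the event fields, the FRAUD_THRESHOLD value, and the function names are all hypothetical.

```python
import queue
import threading

FRAUD_THRESHOLD = 1000.0  # hypothetical rule: flag unusually large payments
_SENTINEL = None  # marks the end of the stream


def produce(topic, events):
    """Push events from one source into the shared queue."""
    for event in events:
        topic.put(event)


def consume(topic, alerts):
    """Read events as they arrive and record the ones that look suspicious."""
    while True:
        event = topic.get()
        if event is _SENTINEL:
            break
        if event["amount"] > FRAUD_THRESHOLD:
            alerts.append(event["card"])


def run_stream(sources):
    """Start one consumer and one producer per source; return flagged cards."""
    topic = queue.Queue()
    alerts = []
    consumer = threading.Thread(target=consume, args=(topic, alerts))
    consumer.start()
    producers = [
        threading.Thread(target=produce, args=(topic, source))
        for source in sources
    ]
    for producer in producers:
        producer.start()
    for producer in producers:
        producer.join()
    topic.put(_SENTINEL)  # every producer has finished
    consumer.join()
    return sorted(alerts)
```

    The consumer works while the producers are still writing, which is the property the fraud-detection use case depends on. A real deployment would add what this sketch leaves out, such as persistence and multiple consumers.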

  6. superset

    Apache Superset is a Data Visualization and Data Exploration Platform

    Open-source BI platforms such as Metabase (website, GitHub) and Apache Superset (website, GitHub) are easy to deploy without IT involvement. Metabase lets you build dashboards from the data in your warehouse with no SQL or, if you have data engineering or data science know-how, inside more powerful and flexible notebooks or with SQL itself. Similarly, Apache Superset helps businesses explore and visualize data, from simple line charts to detailed geospatial charts.

  7. unomi

    Apache Unomi

    For data ingestion, businesses increasingly rely on Customer Data Platforms (CDPs) that track, collect, and ingest data from multiple sources and systems into a single platform to produce a unified customer view. Apache Unomi (website, GitHub) is an example of an open-source CDP that ingests data and collects it in one place.
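
    As an illustration of what a "unified customer view" means in practice, the sketch below merges events from several sources into one profile per customer. It is not Apache Unomi's API; the field names (customer_id, source, name) and the unify function are hypothetical.

```python
from collections import defaultdict


def unify(events):
    """Merge events from several sources into one profile per customer."""
    profiles = defaultdict(lambda: {"sources": set(), "events": []})
    for event in events:
        profile = profiles[event["customer_id"]]
        profile["sources"].add(event["source"])
        profile["events"].append(event["name"])
    return dict(profiles)


if __name__ == "__main__":
    sample = [
        {"customer_id": "u1", "source": "web", "name": "signup"},
        {"customer_id": "u1", "source": "mobile", "name": "purchase"},
        {"customer_id": "u2", "source": "web", "name": "pageview"},
    ]
    for customer, profile in unify(sample).items():
        print(customer, sorted(profile["sources"]), profile["events"])
```

    This assumes every source already shares a customer identifier. Real CDPs spend much of their effort on resolving identities when it does not.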

  8. Snowplow

    The leader in Next-Generation Customer Data Infrastructure

    However, the limitations of traditional CDPs, especially around connecting to best-of-breed customer tooling and exposing data for use across an organization, have driven a new generation of non-CDPs. Solutions like Snowplow's (website, GitHub) data delivery platform and RudderStack's (website, GitHub) customer data platform for developers ingest data from a multitude of sources and apply in-stream transformations. Snowplow then routes the data to your data warehouse, while RudderStack routes it to your warehouse plus your preferred customer tooling destinations for activation.
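
    The ingest, transform, and route pattern can be sketched in a few lines. This is only an illustration of the pattern, not Snowplow's or RudderStack's API; mask_email, route, and the in-memory destinations are hypothetical names.

```python
def mask_email(event):
    """In-stream transformation: hide the local part of an email address."""
    out = dict(event)  # copy so the incoming event is not modified
    if "email" in out:
        out["email"] = "***@" + out["email"].split("@", 1)[-1]
    return out


def route(events, transform, destinations):
    """Apply the transformation once, then fan out to every destination."""
    for event in events:
        cleaned = transform(event)
        for deliver in destinations:
            deliver(cleaned)


if __name__ == "__main__":
    warehouse, crm = [], []  # stand-ins for real destinations
    incoming = [{"user": "u1", "email": "ann@example.com"}]
    route(incoming, mask_email, [warehouse.append, crm.append])
    print(warehouse, crm)
```

    Passing only the warehouse as a destination corresponds to the warehouse-only routing described above, and adding further callables corresponds to routing to downstream tools as well.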

  10. Rudderstack

    Privacy and Security focused Segment-alternative, in Golang and React

  11. rudderstack-docs

    Documentation repository for RudderStack - the Customer Data Platform for Developers.

  12. ClickHouse

    ClickHouse® is a real-time analytics database management system

  13. PostHog

    🦔 PostHog provides open-source web & product analytics, session recording, feature flagging and A/B testing that you can self-host. Get started - free.

    The self-hosted PostHog (website, GitHub) is an open-source alternative for product analytics that integrates into your own infrastructure. You can analyze how customers interact with your product, track user traffic, and find ways to improve user retention.
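
    Retention, mentioned above, is a simple ratio once events are available. The sketch below shows one common definition under stated assumptions: events are (user, day) pairs, and retention is the share of users active on a cohort day who are active again on a later day. It is not PostHog's API, and retention_rate is a hypothetical helper.

```python
from datetime import date


def retention_rate(events, cohort_day, return_day):
    """Share of users active on cohort_day who were also active on return_day."""
    cohort = {user for user, day in events if day == cohort_day}
    if not cohort:
        return 0.0  # no users in the cohort, so nothing to retain
    returned = {
        user for user, day in events if day == return_day and user in cohort
    }
    return len(returned) / len(cohort)


if __name__ == "__main__":
    events = [
        ("u1", date(2024, 1, 1)),
        ("u2", date(2024, 1, 1)),
        ("u1", date(2024, 1, 8)),
        ("u3", date(2024, 1, 8)),
    ]
    print(retention_rate(events, date(2024, 1, 1), date(2024, 1, 8)))
```

    Product analytics tools typically offer several retention definitions, so the numbers from this sketch will not necessarily match a given dashboard.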

  14. Metabase

    The easy-to-use open source Business Intelligence and Embedded Analytics tool that lets everyone work with data

  15. Countly

    Countly is a product analytics platform that helps teams track, analyze and act on their user actions and behaviour on mobile, web and desktop applications.

    Countly (website, GitHub) is another open-source product analytics platform, designed primarily for marketing organizations. It helps marketers track website information such as transactions, campaigns, and the sources that led visitors to the website. Countly also collects real-time mobile analytics metrics, such as active users, time spent in-app, and customer location, in a unified view on your dashboard.

  16. Apache Superset

    Discontinued. Apache Superset is a Data Visualization and Data Exploration Platform. [Moved to: https://github.com/apache/superset]

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
