The Data Trinity

This page summarizes the projects mentioned and recommended in the original post on dev.to

  • DBT-Bon-Voyage

    Learning DBT

  • This component of the trinity is mainly involved in processing your data according to your business needs and in storing it. The transform component is where the business logic lives: this is where you perform data transformations to generate business value from data. Transformations can be as simple as filters or as complex as joins, rolling column computations, pivots, etc. Most transformations are done in SQL, both because SQL is well suited to data processing and because most storage systems are designed to be SQL compatible. A common tool in this space is Data Build Tool (DBT), a powerful tool that supercharges SQL by introducing other programming paradigms such as Jinja templating, source definitions, hooks, variables, and sanity checks/tests, among others. The alternative would be stored procedures and SQL scripts, but you would lose out on all the great features of DBT. I have had a go at it on my personal GitHub, which you can check out here: DBT Bon Voyage. Fivetran also offers an SQL-based transformation interface where you can run your SQL scripts.

  • dbt-core

    dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.

  • nodejs-bigquery

    Node.js client for Google Cloud BigQuery: A fast, economical and fully-managed enterprise data warehouse for large-scale data analytics.

  • The storage component is simply where the data lives after extraction and loading. The most common artifact is a data warehouse or a data lake, depending on the conceptual design. The most common services in this category are AWS Redshift, Google Cloud BigQuery, and Snowflake.

  • airbyte

    The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

  • If your company is price sensitive, you might opt for a free, open-source tool such as Airbyte and absorb the engineering costs of setup and maintenance. If your company would rather focus its engineering resources on tasks other than extraction and loading, it can pay for a managed service like Fivetran.
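
To make the transform component described above a little more concrete, here is a minimal sketch of a SQL transformation that combines a filter, a join, and a simple pivot. It is not taken from the original post: the table names (`customers`, `orders`, `revenue_by_country`), the columns, and the sample rows are all hypothetical, and an in-memory SQLite database stands in for a real warehouse such as BigQuery or Snowflake.

```python
import sqlite3

# Assumption: in-memory SQLite plays the role of the warehouse for illustration.
conn = sqlite3.connect(":memory:")

# "Extract and load" stand-in: hypothetical raw tables with a few sample rows.
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, country TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        status TEXT,
        amount REAL
    );
    INSERT INTO customers VALUES (1, 'KE'), (2, 'UG');
    INSERT INTO orders VALUES
        (1, 1, 'paid', 100.0),
        (2, 1, 'refunded', 40.0),
        (3, 2, 'paid', 25.0),
        (4, 2, 'cancelled', 60.0);
""")

# The transform: business logic expressed in SQL.
#   - filter: cancelled orders are dropped
#   - join:   orders are enriched with the customer's country
#   - pivot:  order status becomes one column per status
TRANSFORM_SQL = """
    CREATE TABLE revenue_by_country AS
    SELECT
        c.country,
        SUM(CASE WHEN o.status = 'paid' THEN o.amount ELSE 0.0 END) AS paid_amount,
        SUM(CASE WHEN o.status = 'refunded' THEN o.amount ELSE 0.0 END) AS refunded_amount
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
    WHERE o.status <> 'cancelled'
    GROUP BY c.country
"""
conn.execute(TRANSFORM_SQL)

# Read the transformed table back, one row per country.
rows = conn.execute(
    "SELECT country, paid_amount, refunded_amount "
    "FROM revenue_by_country ORDER BY country"
).fetchall()
print(rows)
```

In a DBT project, the `SELECT` part of a statement like this would typically live in its own model file, with DBT taking care of materializing the result as a table or view; the Python wrapper here exists only so that the sketch can run on its own.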
