etl-markup-toolkit VS mara-pipelines

Compare etl-markup-toolkit and mara-pipelines and see how they differ.

etl-markup-toolkit

ETL Markup Toolkit is a Spark-native tool for expressing ETL transformations as configuration (by leozqin)

mara-pipelines

A lightweight opinionated ETL framework, halfway between plain scripts and Apache Airflow (by mara)

                etl-markup-toolkit   mara-pipelines
Mentions        7                    3
Stars           5                    2,054
Growth          -                    0.4%
Activity        0.0                  6.0
Last commit     about 3 years ago    4 months ago
Language        Python               Python
License         MIT License          MIT License

Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

etl-markup-toolkit

Posts with mentions or reviews of etl-markup-toolkit. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2021-08-22.

mara-pipelines

Posts with mentions or reviews of mara-pipelines. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2021-08-22.

What are some alternatives?

When comparing etl-markup-toolkit and mara-pipelines you can also consider the following projects:

PySpark-Boilerplate - A boilerplate for writing PySpark Jobs

abcd-hcp-pipeline - BIDS application for processing functional MRI data, robust to scanner, acquisition, and age variability.

quinn - PySpark methods to enhance developer productivity 📣 👯 🎉

kuwala - Kuwala is the no-code data platform for BI analysts and engineers enabling you to build powerful analytics workflows. We set out to bring the state-of-the-art data engineering tools you love, such as Airbyte, dbt, or Great Expectations, together in one intuitive interface built with React Flow. In addition, we provide third-party data into data science models and products, with a focus on geospatial data. Currently, the following data connectors are available worldwide: a) high-resolution demographics data, b) points of interest from OpenStreetMap, c) Google Popular Times.

sparkmagic - Jupyter magics and kernels for working with remote Spark clusters

pybaseball - Pull current and historical baseball statistics using Python (Statcast, Baseball Reference, FanGraphs)

tdigest - t-Digest data structure in Python. Useful for percentiles and quantiles, including distributed environments like PySpark

dbt-core - dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.

dremio-oss - Dremio - the missing link in modern data

airbyte - The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

citus - Distributed PostgreSQL as an extension

sgr - sgr (command line client for Splitgraph) and the splitgraph Python library