mara-pipelines
abcd-hcp-pipeline
Metric | mara-pipelines | abcd-hcp-pipeline |
---|---|---|
Mentions | 3 | 1 |
Stars | 2,054 | 44 |
Growth | 0.4% | - |
Activity | 6.0 | 6.0 |
Last commit | 5 months ago | 12 days ago |
Language | Python | Python |
License | MIT License | BSD 3-clause "New" or "Revised" License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
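As a rough illustration of the Growth figure, month over month growth can be read as the percentage change in star count between two consecutive months. The sketch below assumes that reading; the exact formula used for the table is not stated here, and the function name and the previous-month count of 2,046 are hypothetical values chosen only for illustration.

```python
def month_over_month_growth(previous: int, current: int) -> float:
    """Return the percentage change in star count from one month to the next."""
    if previous <= 0:
        raise ValueError("previous star count must be positive")
    return (current - previous) / previous * 100.0


# 2,054 is the mara-pipelines star count from the table above; 2,046 is a
# hypothetical count for the month before, not a figure from the source.
growth = month_over_month_growth(previous=2046, current=2054)
print(f"{growth:.1f}%")  # prints "0.4%"
```

A project with no recorded count for the previous month has no defined growth under this reading, which would be consistent with the "-" shown for abcd-hcp-pipeline.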
mara-pipelines
- How to keep track of the different Transformations done in an ETL pipeline?
  The closest I've found is Mara, but it's not what I'm after.
- Using PostgreSQL as a Data Warehouse
  The tooling behind the approach has been built as a set of Python packages named Mara. It is available on GitHub:
  https://github.com/mara/mara-pipelines
  Additional packages can be found at the Mara org:
  https://github.com/mara
- Build your own “data lake” for reporting purposes
  MinIO and NiFi require machines of their own. You're better off with pure Python, and if one wants something lightweight and visually pleasing, Mara [0] or Dagster with Dagit [1] will do the job.
  [0] https://github.com/mara/mara-pipelines
  [1] https://docs.dagster.io/tutorial/execute
abcd-hcp-pipeline
- Siemens output from ABCD T1 and T2 sequences.
  Who provided the sequence? They're usually the point of contact for this kind of question. Alternatively, you can bug one of the processing groups for ABCD (link), and they might point you in the right direction. The chance of one of the ABCD or ABIDE/HCP sequence designers seeing this on Reddit is slim, but good luck.
What are some alternatives?
kuwala - Kuwala is the no-code data platform for BI analysts and engineers, enabling you to build powerful analytics workflows. We set out to bring the state-of-the-art data engineering tools you love, such as Airbyte, dbt, and Great Expectations, together in one intuitive interface built with React Flow. In addition, we provide third-party data for data science models and products, with a focus on geospatial data. Currently, the following data connectors are available worldwide: a) high-resolution demographics data, b) points of interest from OpenStreetMap, c) Google Popular Times.
papermill - 📚 Parameterize, execute, and analyze notebooks
pybaseball - Pull current and historical baseball statistics using Python (Statcast, Baseball Reference, FanGraphs)
PyFunctional - Python library for creating data pipelines with chain functional programming
dbt-core - dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
fmriprep - fMRIPrep is a robust and easy-to-use pipeline for preprocessing of diverse fMRI data. The transparent workflow dispenses with manual intervention, thereby ensuring the reproducibility of the results.
etl-markup-toolkit - ETL Markup Toolkit is a spark-native tool for expressing ETL transformations as configuration
Kedro - Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
dremio-oss - Dremio - the missing link in modern data
pdpipe - Easy pipelines for pandas DataFrames.
airbyte - The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
skull-stripping-3D-brain-mri - Apply Skull Stripping to any 3D Brain MRI images