| | nfcompose | mara-pipelines |
|---|---|---|
| Mentions | 6 | 3 |
| Stars | 32 | 2,053 |
| Growth | - | 0.1% |
| Activity | 8.9 | 6.0 |
| Last commit | 19 days ago | 5 months ago |
| Language | Python | Python |
| License | Mozilla Public License 2.0 | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
nfcompose
-
Implementing system-versioned tables in Postgres
I have implemented this for our tool NF Compose, which allows us to build REST APIs without writing a single line of code [0]. I didn't go the route of triggers because we generate database tables automatically, and we used to have a crazy versioning scheme, inspired by Data Vault and Anchor Modelling, where we stored every change to every attribute as a new record.
It sounded cool, but in practice it was really slow. The techniques that Data Vault usually employs to fix this issue seemed too complex. Over time we moved to an implementation that handles the historization dynamically at runtime by generating SQL queries ourselves [1]. On a side note: generating SQL in Python sounds dangerous, but we spent a lot of time making it secure. We even have a linter that checks that everything is escaped properly whenever we are in dev mode [2].
[0] https://github.com/neuroforgede/nfcompose/
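The runtime historization approach described above can be sketched roughly as follows. This is a hypothetical illustration, not NF Compose's actual code: identifiers are validated against a strict whitelist before being interpolated into the generated SQL, while data values are only ever bound through placeholders.

```python
import re

# Identifiers may only contain lowercase letters, digits and underscores.
IDENT_RE = re.compile(r"^[a-z_][a-z0-9_]*$")

def quote_ident(name: str) -> str:
    """Whitelist-validate an identifier before quoting it for SQL."""
    if not IDENT_RE.match(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return f'"{name}"'

def historized_update(table: str, key: str, columns: list[str]) -> str:
    """Generate SQL that closes the current row version and inserts a new one.

    Data values are passed separately as query parameters (%s placeholders),
    so only validated identifiers ever end up in the SQL text.
    """
    t = quote_ident(table)
    k = quote_ident(key)
    cols = ", ".join(quote_ident(c) for c in columns)
    placeholders = ", ".join(["%s"] * len(columns))
    return (
        f"UPDATE {t} SET valid_to = now() "
        f"WHERE {k} = %s AND valid_to IS NULL;\n"
        f"INSERT INTO {t} ({k}, {cols}, valid_from, valid_to) "
        f"VALUES (%s, {placeholders}, now(), NULL);"
    )
```

A linter in the spirit of the one mentioned above could then statically check that every string interpolated into generated SQL passed through `quote_ident` first.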
- Show HN: NF Compose – An API to Build/Generate REST APIs
- GitHub - neuroforgede/nfcompose: Build REST APIs/Integrations in minutes instead of hours
- GitHub - neuroforgede/nfcompose: NF Compose is a (data) integration platform that allows developers to define REST APIs in seconds instead of hours. Generated REST APIs are backed by postgres and support automatic consumer notifications on data changes out of the box.
-
NF Compose – define REST APIs in seconds instead of hours
As part of our services we also provide support for building integrations between (our) systems and external systems. As we didn't want to keep building the same REST APIs every time, we set out to build a standardized data integration platform that provides a quick way to generate user-specified REST API definitions via a REST API. This has become NF Compose (https://github.com/neuroforgede/nfcompose).
- Show HN: NF Compose – define REST APIs in seconds instead of minutes
mara-pipelines
-
How to keep track of the different Transformations done in an ETL pipeline?
The closest I've found is Mara, but it's not quite what I'm after.
-
Using PostgreSQL as a Data Warehouse
The tooling behind this approach has been built as a set of Python packages named Mara. It is available on GitHub:
https://github.com/mara/mara-pipelines
And additional packages can be found at the Mara org:
https://github.com/mara
-
Build your own “data lake” for reporting purposes
Minio and NiFi require machines of their own. You're better off with pure Python, and if one wants something lightweight and visually pleasing, Mara [0] or Dagster with Dagit [1] will do the job.
[0] https://github.com/mara/mara-pipelines
[1] https://docs.dagster.io/tutorial/execute
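A "pure Python" pipeline in the spirit of that comment can be as small as a dict of tasks run in dependency order. This is a hypothetical standard-library sketch, not Mara or Dagster code:

```python
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# Each task maps to (callable, dependencies). Tasks share a context dict.
tasks = {
    "extract":   (lambda ctx: ctx.update(rows=[10, 20, 30]), []),
    "transform": (lambda ctx: ctx.update(total=sum(ctx["rows"])), ["extract"]),
    "load":      (lambda ctx: ctx.update(loaded=True), ["transform"]),
}

def run(tasks):
    """Run all tasks in topological (dependency) order on a shared context."""
    order = TopologicalSorter({name: deps for name, (_, deps) in tasks.items()})
    ctx = {}
    for name in order.static_order():
        tasks[name][0](ctx)
    return ctx
```

Calling `run(tasks)` executes extract, transform, and load in order and returns the final context, e.g. `{"rows": [10, 20, 30], "total": 60, "loaded": True}`.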
What are some alternatives?
retake - PostgreSQL for Search [Moved to: https://github.com/paradedb/paradedb]
abcd-hcp-pipeline - bids application for processing functional MRI data, robust to scanner, acquisition and age variability.
airbyte - The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
kuwala - Kuwala is the no-code data platform for BI analysts and engineers enabling you to build powerful analytics workflows. We are set out to bring state-of-the-art data engineering tools you love, such as Airbyte, dbt, or Great Expectations together in one intuitive interface built with React Flow. In addition we provide third-party data into data science models and products with a focus on geospatial data. Currently, the following data connectors are available worldwide: a) High-resolution demographics data b) Point of Interests from Open Street Map c) Google Popular Times
sgr - sgr (command line client for Splitgraph) and the splitgraph Python library
pybaseball - Pull current and historical baseball statistics using Python (Statcast, Baseball Reference, FanGraphs)
frappe - Low code web framework for real world applications, in Python and Javascript
dbt-core - dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
nfcompose-examples
etl-markup-toolkit - ETL Markup Toolkit is a spark-native tool for expressing ETL transformations as configuration
webhooks-bridge - A simple webhook receiver that filters, transforms and forwards webhooks
dremio-oss - Dremio - the missing link in modern data