Meerschaum
Create and manage data pipes with Meerschaum. (by bmeares)
mrsm-compose-template
Bootstrap your Meerschaum Compose project with this template repository. (by bmeares)
| | Meerschaum | mrsm-compose-template |
|---|---|---|
| Mentions | 17 | 1 |
| Stars | 121 | 0 |
| Growth | - | - |
| Activity | 6.7 | 5.1 |
| Last commit | 10 days ago | 10 months ago |
| Language | Python | Python |
| License | Apache License 2.0 | Apache License 2.0 |
The number of mentions is the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
Meerschaum
Posts with mentions or reviews of Meerschaum.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2023-06-30.
-
Using SQL inside Python pipelines with Duckdb, Glaredb (and others?)
This sounds like a great use case for Meerschaum. You can organize your scripts into plugins and build out incremental transformations in SQL. We use Meerschaum Compose for client integrations and ETL in a similar workflow to yours.
-
Found a great new open source ELT Library - any pointers?
My company has been using a lot of PySpark, but we're working with not-large data (<1TB/source/day) so Spark can be a bit of overkill sometimes and I've been looking for a light-weight replacement. I think I found a replacement that fits all our needs called Meerschaum but I don't see a lot of other DEs talking about it.
-
I'm struggling with how to ask for help with my task.
Do the tables have something like a datetime or integer index column? At my job, we use the ETL Python package Meerschaum to sync our tables, and for large ones, we split the sync into chunks with --begin (inclusive) and --end (exclusive).
-
For those of you who were self taught, what was your path into data engineering
I worked as the first data engineer for a student internship for two years, during which I rewrote the system several times until I had a time-series ETL system that fit their needs perfectly. After leaving, I took what I learned and started the ETL package Meerschaum, and after a few consulting contracts to deploy Meerschaum, I landed a DE job to manage Meerschaum deployments internally. A bit unconventional but worked out as I had hoped.
-
Wanted to share my open source incremental ETL framework: Meerschaum
There's a whole lot more that you can do with the framework, but this post is getting kinda long. Please check out the project homepage for more details, and I'd really love to know what y'all think! Can you see a use case for the framework in your stack?
-
Python ETL - Jupyter/Pandas/Postgresql(DW) - Project Structure and Scripting
I'm the author of the ETL framework Meerschaum which is meant for this exact purpose. You can build an ETL pipeline in a few lines of Python, e.g. here's a quick video. Check out the Getting Started guide and the docs on writing your first plugin to get your data flowing!
-
Tools that allow you to use scripts to build/maintain data pipeline
You can prototype some scripts with a tool called Meerschaum that I built for this kind of purpose. Once you're ready to deploy your prototype, you could refactor it for something more suited for enterprise like Airflow.
- Meerschaum - Data Visualization Pipelines in Minutes
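
One of the posts above mentions splitting a large sync into chunks with --begin (inclusive) and --end (exclusive). A minimal sketch of how such half-open windows could be generated is shown here; the helper name `chunk_bounds`, the one-day step, and the example dates are hypothetical and not part of Meerschaum itself. Only the two flag names come from the post.

```python
from datetime import datetime, timedelta
from typing import Iterator, Tuple


def chunk_bounds(
    begin: datetime,
    end: datetime,
    step: timedelta,
) -> Iterator[Tuple[datetime, datetime]]:
    """Yield half-open (chunk_begin, chunk_end) windows covering [begin, end).

    Hypothetical helper for illustration only; not a Meerschaum function.
    """
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    chunk_begin = begin
    while chunk_begin < end:
        # Clamp the last window so it never runs past the overall end.
        chunk_end = min(chunk_begin + step, end)
        yield chunk_begin, chunk_end
        # The next window starts exactly where this one stopped.
        chunk_begin = chunk_end


# Example dates are made up for the sketch.
bounds = list(
    chunk_bounds(datetime(2023, 1, 1), datetime(2023, 1, 4), timedelta(days=1))
)
for chunk_begin, chunk_end in bounds:
    # Print the flag values each chunked sync would receive.
    print(f"--begin {chunk_begin:%Y-%m-%d} --end {chunk_end:%Y-%m-%d}")
```

Because each window includes its begin and excludes its end, adjacent chunks share a boundary value without any row falling into two chunks or being skipped.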
mrsm-compose-template
Posts with mentions or reviews of mrsm-compose-template.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2023-06-11.
-
Found a great new open source ELT Library - any pointers?
Hi there, Meerschaum author here! Please see this template repository and the plugins page if you'd like to learn more. The Meerschaum Compose workflow is similar to Meltano's but more lightweight and time-series-focused. Please don't let the number of stars discourage you from trying it out!
What are some alternatives?
When comparing Meerschaum and mrsm-compose-template you can also consider the following projects:
Prefect - The easiest way to build, run, and monitor data pipelines at scale.
glaredb - GlareDB: An analytics DBMS for distributed data
chdb - chDB is an embedded OLAP SQL Engine powered by ClickHouse
gspreadsheet_fdw - Multicorn-based PostgreSQL foreign data wrapper for Google Spreadsheets
risingwave - SQL stream processing, analytics, and management. We decouple storage and compute to offer instant failover, dynamic scaling, speedy bootstrapping, and efficient joins.
techslamneggs - The code for my May 3, 2023 workshop at Greenville's Tech Slam 'N Eggs!
syncx - This Meerschaum plugin implements experimental syncing methods.