dagster
Rudderstack
Metric | dagster | Rudderstack
---|---|---
Mentions | 46 | 83
Stars | 9,939 | 3,897
Growth | 4.7% | 1.3%
Activity | 10.0 | 9.8
Latest commit | 5 days ago | about 22 hours ago
Language | Python | Go
License | Apache License 2.0 | GNU General Public License v3.0 or later
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
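As a rough illustration of the Growth figure, here is a minimal sketch that assumes "month over month growth" means the relative change in star count between two consecutive months. The page does not state its exact formula, so both function names and the sample counts are hypothetical.

```python
def month_over_month_growth(previous: int, current: int) -> float:
    """Percentage change in star count from one month to the next (assumed formula)."""
    if previous <= 0:
        raise ValueError("previous star count must be positive")
    return (current - previous) / previous * 100.0


def implied_previous_count(current: int, growth_percent: float) -> float:
    """Invert the assumed formula to estimate last month's star count."""
    return current / (1.0 + growth_percent / 100.0)


if __name__ == "__main__":
    # Hypothetical counts chosen only to illustrate the calculation.
    print(f"{month_over_month_growth(1000, 1047):.1f}%")
    # Estimate from the figures in the table above (9,939 stars, 4.7% growth).
    print(round(implied_previous_count(9939, 4.7)))
```

Under this assumption, a project's listed growth and current star count are enough to back out an approximate star count for the previous month.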
dagster
-
The Dagster Master Plan
I found this example that helped me - https://github.com/dagster-io/dagster/tree/master/examples/project_fully_featured/project_fully_featured
In the meantime, we're collecting solutions and use cases in our GitHub Discussions, and you're welcome to ask any specific questions in there!
-
What are some open-source ML pipeline managers that are easy to use?
I would recommend the following: https://www.mage.ai/, https://dagster.io/, https://www.prefect.io/, https://metaflow.org/, https://zenml.io/home
-
Best Orchestration Tool to run dbt projects?
Dagster seemed really cool when I looked into it as an alternative to Airflow. I especially like the software-defined assets and built-in lineage, which I haven't seen in any other tool. However, it seems it does not support RBAC, which is a pretty big issue if you want a self-service type of architecture; see https://github.com/dagster-io/dagster/issues/2219. It does seem to be available in their hosted version, but I wanted to run it myself on k8s.
-
dbt Cloud Alternatives?
Dagster? https://dagster.io
-
What's the best thing/library you learned this year?
One that I haven't seen on here yet: dagster
-
Can we take a moment to appreciate how much of dataengineering is open source?
-
Dagger Python SDK: Develop Your CI/CD Pipelines as Code
I wondered how it related to https://dagster.io/
-
Data Engineer Github Profile?
You can find all current, closed, and resolved issues in the “Issues” section and explore them using filters, e.g. issues for dagster. Look into some of the issues and feel free to ask a question or post your idea: it’s much less toxic here (compared to SO, for example).
-
[D] Should I go with Prefect, Argo or Flyte for Model Training and ML workflow orchestration?
You could also consider Dagster, which aims to improve on Apache Airflow's shortcomings. Also, take a look at MyMLOps, where you can get a quick overview of open-source orchestration tools.
Rudderstack
-
Google Analytics 4 Has Me So Frustrated, We Built Our Own Analytics Service
In bigger setups, all you want is a data collector and router so that you can feed the data into multiple destinations, depending on the use case. Analytics is just one. Example: https://www.rudderstack.com/ & https://www.rudderstack.com/replace-google-analytics-4-guide...
-
I want to contribute to open source but don't know where to start
Check out RudderStack, a Go project for building data pipelines. Our Slack is quite active. The best way to contribute is by creating a new integration with your favorite tool. You do not need to rely too much on existing knowledge of the project's inner workings to do so, so it is beginner-friendly.
-
Writing few lines of open-source js/python code can get ₹8k-80k. Is it a good reward for an oss challenge? Last day, more prizes than the participants until now :)
I thought: start the challenge, and they will come. I was wrong. This is the last day of the Transformations challenge at RudderStack. If you compare the number of submissions with the number of prizes, there is a good chance that your submission will get a prize (lowest ₹8k and highest ₹80k).
The challenge is over. Winners have been announced. When we are ready for the next one, we will announce it on the RudderStack GitHub repo.
-
Project showcase: sample Data Lakehouse
Super. This is amazing. Sharing your project with the community. If you get a chance, try out RudderStack to build your pipeline.
-
Entire tech industry right now
P.S. If you're here, do support the open-source project RudderStack
RudderStack?
-
Go doesn’t do any magical stuff and I love that
RudderStack (open-source event streaming, an alternative to Segment) processed 1 trillion+ events last year and has 400+ integrations with different services. It would have been a nightmare if it had not been built in Go.
-
Interview Prep - Senior Data Integration role
RudderStack, dbt, Kafka, Headless CDP, etc. off the top of my head
-
Can we take a moment to appreciate how much of dataengineering is open source?
It takes a village to build an open-source project. Grateful to the 170+ contributors who have contributed to RudderStack.
What are some alternatives?
Prefect - The easiest way to build, run, and monitor data pipelines at scale.
Airflow - Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Mage - 🧙 The modern replacement for Airflow. Mage is an open-source data pipeline tool for transforming and integrating data. https://github.com/mage-ai/mage-ai
airbyte - The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
MLflow - Open source platform for the machine learning lifecycle
Snowplow - The enterprise-grade behavioral data engine (web, mobile, server-side, webhooks), running cloud-natively on AWS and GCP
PostHog - 🦔 PostHog provides open-source product analytics, session recording, feature flagging and A/B testing that you can self-host.
meltano
OpenLineage - An Open Standard for lineage metadata collection
streamlit - Streamlit — A faster way to build and share data apps.
Socioboard - Socioboard is world's first and open source Social Technology Enabler. Socioboard Core is our flagship product.
ploomber - The fastest ⚡️ way to build data pipelines. Develop iteratively, deploy anywhere. ☁️