sayn vs data-engineering-wiki
Metric | sayn | data-engineering-wiki |
---|---|---|
Mentions | 2 | 15 |
Stars | 117 | 1,027 |
Growth | 0.0% | 5.7% |
Activity | 6.8 | 7.9 |
Last commit | 3 days ago | 28 days ago |
Language | Python | CSS |
License | Apache License 2.0 | Creative Commons Zero v1.0 Universal |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
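As a rough illustration of the Growth figure, the sketch below assumes it is the simple relative change in star count between two monthly snapshots. The exact formula used by the site is not stated here, and the function name and the snapshot values are hypothetical.

```python
def month_over_month_growth(stars_previous: int, stars_current: int) -> float:
    """Return the relative change between two monthly star counts, in percent.

    Assumption: growth = (current - previous) / previous. This is a guess at
    the metric, not the site's documented formula.
    """
    if stars_previous <= 0:
        raise ValueError("stars_previous must be positive")
    return (stars_current - stars_previous) / stars_previous * 100.0


# Hypothetical snapshots: a repository going from 1,000 to 1,057 stars.
growth = month_over_month_growth(1000, 1057)
print(f"{growth:.1f}%")
```

Under this assumption, a project whose star count did not change over the month would report 0.0%.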
sayn
- Average reply times from some of my Facebook friends over the last few years [OC]. Full article: https://medium.com/@timsugaipov/taking-your-facebook-messenger-data-further-f9da079b1409?source=friends_link&sk=3bd04bb35ad9a4b6f586300e52f96e4f
Data Processing: SAYN
- Introducing SAYN: A Simple Yet Powerful Data Processing Framework.
We believe simplicity to be crucial when maintaining pipelines at scale. However, we also believe that simplicity should not come at the expense of flexibility. This is why we have built our own open source data processing framework: SAYN. SAYN is designed to empower analytics teams by being simple, flexible and centralised. It democratises the contribution to data processes within an analytics team, enables full flexibility and helps save a lot of time through automation.
data-engineering-wiki
- Data Engineering Glossary
- ETL practice
My suggestions:
1. Browse https://dataengineering.wiki/ and go over r/dataengineering in general.
2. In mid-sized companies, the trend is to outsource Extract and Load to providers like Fivetran or Airbyte (open-source), then transform the data with dbt in a data warehouse using SQL.
3. In big companies, you won't touch much ETL design. You just need to be proficient in Python, Spark, and SQL.
4. Make sure you know what a star schema, fact tables, and dimension tables are.
- Anything else to read
- Looking for blogs for backend development
Hi everyone! As mentioned in the title, I recently came across two great blogs for data engineering: startdataengineering.com and dataengineering.wiki
- DE- How to get my foot in the door?
The data engineering subreddit maintains a wiki of advice, resources, and recommendations at https://dataengineering.wiki/. Your question is answered in their FAQ here
- Getting into Data Engineering and more!
- Are there avenues into sports science as a software engineer or web dev?
Data engineering
- Switching to something more technical
r/dataengineering has a wiki at https://dataengineering.wiki and a Discord server that is pretty active.
- Data Engineering Concepts: Definitions, Backlinks, and Graph View
Almost the same as the wiki https://dataengineering.wiki/
- dataengineering.wiki Bug
Hi, would you mind opening an issue on GitHub? We can help you debug the issue there.
What are some alternatives?
dbt-databricks - A dbt adapter for Databricks.
glossary - Data Glossary 🧠: An interactive digital garden for deeper data exploration. Learn through a graph and backlinks, enabling layered knowledge discovery.
dataform - Dataform is a framework for managing SQL based data operations in BigQuery
Dataplane - Dataplane is a data platform that makes it easy to construct a data mesh with automated data pipelines and workflows.
tinvois-parser - Extract receipt info
versatile-data-kit - One framework to develop, deploy and operate data workflows with Python and SQL.
beneath - Beneath is a serverless real-time data platform ⚡️
quartz - 🌱 a fast, batteries-included static-site generator that transforms Markdown content into fully functional websites
yaetos - Write data & AI pipelines in (SQL, Spark, Pandas) and deploy to the cloud, simplified
Mage - 🧙 The modern replacement for Airflow. Mage is an open-source data pipeline tool for transforming and integrating data. https://github.com/mage-ai/mage-ai
dbt - dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications. [Moved to: https://github.com/dbt-labs/dbt-core]
Hugo - The world’s fastest framework for building websites.