data_engineering_on_gcp_book
ppp_thing
data_engineering_on_gcp_book | ppp_thing
---|---
12 | 3
116 | 7
- | -
2.6 | 0.0
about 3 years ago | about 1 year ago
C |
- | BSD 3-clause "New" or "Revised" License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
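The exact formula behind the activity number is not given here. As a rough illustration only, the two ideas described above (recent commits weigh more, and the score is relative to all tracked projects) could be sketched as follows. The exponential decay, the 90-day half-life, and the function names `weighted_activity` and `relative_score` are all assumptions made for this sketch, not the site's actual method.

```python
from datetime import date


def weighted_activity(commit_dates, today, half_life_days=90.0):
    """Raw activity: each commit counts less the older it is.

    Assumption: a commit's weight halves every `half_life_days` days.
    """
    return sum(0.5 ** ((today - d).days / half_life_days) for d in commit_dates)


def relative_score(raw, all_raw):
    """Map a raw value onto a 0-10 scale relative to all tracked projects.

    The score is 10 times the fraction of projects with a strictly lower
    raw value, so a score of 9.0 means 90% of projects are less active,
    which places the project in the top 10%.
    """
    values = list(all_raw)
    below = sum(1 for v in values if v < raw)
    return round(10 * below / len(values), 1)
```

Under these assumptions, a project whose last commit is years old ends up with a raw value close to zero, which is consistent with a very low relative score.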
data_engineering_on_gcp_book
-
How possible is it for a beginner to establish pipelines, data warehouse, and visualization solution as a team of 1?
This book will walk you through setting up a complete data engineering stack on GCP: https://github.com/Nunie123/data_engineering_on_gcp_book
-
Python & SQL knowledge needed for ETL?
As for resources, this book goes over a lot of these: https://github.com/Nunie123/data_engineering_on_gcp_book. However, it covers the 'how', not the 'why'. The only method I know for understanding the 'why' is experience, whether at work or on personal projects.
-
Learning Python and SQL: What should be my next step?
Here's a good book to follow along with that introduces common tooling and design patterns: https://github.com/Nunie123/data_engineering_on_gcp_book
-
GitHub Repo with All Data Transformation, Cleaning, Validation
I'm not sure if this is exactly what you're looking for, but here's a book on GitHub that talks about the tools and steps for building data pipelines into a data warehouse: https://github.com/Nunie123/data_engineering_on_gcp_book
-
What is the low hanging fruit for a brand new GCP data engineer to learn?
Check out this book: https://github.com/Nunie123/data_engineering_on_gcp_book
-
Unsure about overall process of data engineering
If you're interested in an example of how to build a complete data engineering infrastructure, you should check out this book: https://github.com/Nunie123/data_engineering_on_gcp_book
-
[HELP] Airflow Reverse proxy + load balancer +docker
If you want to try Airflow without the setup headache, you can try Composer on GCP, which is a hosted version of Airflow. I wrote some info on how to do that here: https://github.com/Nunie123/data_engineering_on_gcp_book/blob/master/ch_2_orchestration.md
-
Transition from a Quality engineer to Data engineer
This book might be a good resource for you: https://github.com/Nunie123/data_engineering_on_gcp_book
-
Accepted a data engineer intern role at a Big N company - how do I learn as much as possible?
If you want a place to start on personal projects you can check out this book, https://github.com/Nunie123/data_engineering_on_gcp_book, which will walk you through the basics of setting up a full data engineering stack.
-
What tools, software, programming languages, and etc. does a data engineer need to have in 2021
If you are interested in tooling, here's a free book on setting up a basic data engineering tech stack on GCP: https://github.com/Nunie123/data_engineering_on_gcp_book
ppp_thing
-
Ask HN: Have you created programs for only your personal use?
I wrote a PPPoE client with failover so I can keep the session even when one of my gateways fails or is rebooted (this lets me do regular maintenance without interrupting my internet connection); I put it on GitHub [1], but I doubt anyone will use it. I hope there are few people left with the scourge that is PPPoE, and my OS choice means many people would need to switch OSes to use it, so yeah. Also, I don't care to make it easy to use or to promote it, really. (I've mentioned it once or twice and did a Show HN that got fewer than ten votes, which I kind of expected.)
I've also got my personal (network) monitoring software, some 'IoT' stuff to capture temperature and humidity data around my house, and I'm working on an ESP32-based alarm clock pulling data from iCalendar.
[1] https://github.com/russor/ppp_thing
- Show HN: PPPoE client with session handoff between redundant FreeBSD routers
-
What is your “I don't care if this succeeds” project?
I just published https://github.com/russor/ppp_thing which lets me (and maybe you) failover my PPPoE session between two FreeBSD hosts, so I can do regular maintenance without losing my IP or impacting TCP sessions.
I used to let my DSL modem handle PPPoE and NAT, so failover was easy, but found out fragmented IPv6 crashed the leased modem, and the replacement modem also sucks, so bridge mode + a custom PPPoE client (but from netgraph pieces) it is. Sadly useful in 2021, because PPPoE is somehow still a thing.
What are some alternatives?
shotcaller - A moddable RTS/MOBA game made with bracket-lib and minigene.
polybar-clockify - Control Clockify through Polybar
FactGraph - FactGraph monorepo (backend + frontend + landing page + blog)
vopono - Run applications through VPN tunnels with temporary network namespaces
beubo - Beubo is a free, simple, and minimal CMS with unlimited extensibility using plugins
place
distribyted - Torrent client with HTTP, fuse, and WebDAV interfaces. Start exploring your torrent files right away, even zip, rar, or 7zip archive contents!
meal-scheduler
go-plugin - Golang plugin system over RPC.
fingine - A personal finance simulation engine in Rust.
dali - Indie assembler/linker for Dalvik VM .dex & .apk files (Work In Progress)
scraper - Nodejs web scraper. Contains a command line, docker container, terraform module and ansible roles for distributed cloud scraping. Supported databases: SQLite, MySQL, PostgreSQL. Supported headless clients: Puppeteer, Playwright, Cheerio, JSdom.