distribyted
data_engineering_on_gcp_book
 | distribyted | data_engineering_on_gcp_book
---|---|---
Mentions | 3 | 12
Stars | 1,015 | 116
Growth | 1.5% | -
Activity | 8.6 | 2.6
Last commit | 2 days ago | about 3 years ago
Language | Go | -
License | GNU General Public License v3.0 only | -
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
distribyted
- Distribyted: Torrent client with on-demand file downloading as a filesystem
- Release v0.6.0-alpha3 · distribyted/distribyted · GitHub
- What is your “I don't care if this succeeds” project?
A torrent client that exposes torrent content as files: https://github.com/distribyted/distribyted
It's pretty fun to work on it and implement new use cases. Right now it supports FUSE mounts, but I'm thinking of making it work as a WebDAV server too.
Also, I'm working on several demos, like SQLite compatibility, similar to https://github.com/lmatteis/torrent-net, or CSV analysis using Jupyter notebooks for huge datasets like https://ghtorrent.org/
data_engineering_on_gcp_book
- How possible is it for a beginner to establish pipelines, data warehouse, and visualization solution as a team of 1?
This book will walk you through setting up a complete data engineering stack on GCP: https://github.com/Nunie123/data_engineering_on_gcp_book
- Python & SQL knowledge needed for ETL?
As for resources, this book goes over a lot of these: https://github.com/Nunie123/data_engineering_on_gcp_book. However, it covers the 'how', not the 'why'. The only method I know for understanding the 'why' is experience, whether at work or on personal projects.
- Learning Python and SQL: What should be my next step?
Here's a good book to follow along to introduce you to common tooling and design patterns: https://github.com/Nunie123/data_engineering_on_gcp_book
- GitHub Repo with All Data Transformation, Cleaning, Validation
I'm not sure if this is exactly what you're looking for, but here's a book on GitHub that talks about the tools and steps for building data pipelines into a data warehouse: https://github.com/Nunie123/data_engineering_on_gcp_book
- What is the low hanging fruit for a brand new GCP data engineer to learn?
Check out this book: https://github.com/Nunie123/data_engineering_on_gcp_book
- Unsure about overall process of data engineering
If you're interested in an example of how to build a complete data engineering infrastructure, you should check out this book: https://github.com/Nunie123/data_engineering_on_gcp_book
- [HELP] Airflow Reverse proxy + load balancer + docker
If you want to try Airflow without the setup headache, you can try Composer on GCP, which is a hosted version of Airflow. I wrote some info on how to do that here: https://github.com/Nunie123/data_engineering_on_gcp_book/blob/master/ch_2_orchestration.md
- Transition from a Quality engineer to Data engineer
This book might be a good resource for you: https://github.com/Nunie123/data_engineering_on_gcp_book
- Accepted a data engineer intern role at a Big N company - how do I learn as much as possible?
If you want a place to start on personal projects you can check out this book, https://github.com/Nunie123/data_engineering_on_gcp_book, which will walk you through the basics of setting up a full data engineering stack.
- What tools, software, programming languages, etc. does a data engineer need to have in 2021?
If you are interested in tooling, here's a free book on setting up a basic data engineering tech stack on GCP: https://github.com/Nunie123/data_engineering_on_gcp_book
What are some alternatives?
Video-Hub-App - Official repository for Video Hub App
shotcaller - A moddable RTS/MOBA game made with bracket-lib and minigene.
btfs - A bittorrent filesystem based on FUSE.
FactGraph - FactGraph monorepo (backend + frontend + landing page + blog)
check-all-the-things - check all of the things!
beubo - Beubo is a free, simple, and minimal CMS with unlimited extensibility using plugins
fusell-seed - FUSE (the low-level interface) file system boilerplate
go-plugin - Golang plugin system over RPC.
VimMode.spoon - Adds vim keybindings to all OS X inputs
dali - Indie assembler/linker for Dalvik VM .dex & .apk files (Work In Progress)
electron-browser-shell - A minimal, tabbed web browser with support for Chrome extensions—built on Electron.