| | streamify | finnhub-streaming-data-pipeline |
|---|---|---|
| Mentions | 4 | 2 |
| Stars | 474 | 248 |
| Growth | - | - |
| Activity | 0.0 | 5.6 |
| Last commit | about 2 years ago | 5 months ago |
| Language | Python | HCL |
| License | - | - |
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
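The exact formula behind the activity number is not given here, so the following is only a rough sketch of how such a score could be built. It assumes two things that are not stated above: commits are weighted by an exponential decay with a hypothetical 30-day half-life, and the final 0-10 value is ten times the share of tracked projects with a lower weighted score. The function names and parameters are illustrative only.

```python
from datetime import date


def weighted_commit_score(commit_dates, today, half_life_days=30.0):
    """Sum commit weights; a commit's weight halves every half_life_days.

    The half-life is an assumed parameter, not a documented one.
    """
    return sum(0.5 ** ((today - d).days / half_life_days) for d in commit_dates)


def activity_from_rank(raw_score, all_raw_scores):
    """Map a raw score to a 0-10 scale by the share of projects it beats.

    Assumed mapping: beating 90% of tracked projects yields 9.0,
    which corresponds to being in the top 10%.
    """
    below = sum(1 for s in all_raw_scores if s < raw_score)
    return round(10.0 * below / len(all_raw_scores), 1)


if __name__ == "__main__":
    today = date(2024, 1, 31)
    # One commit today (weight 1.0) and one 30 days ago (weight 0.5).
    print(weighted_commit_score([today, date(2024, 1, 1)], today))
    # Ten hypothetical projects with raw scores 0..9; the best one beats nine.
    print(activity_from_rank(9, list(range(10))))
```

Under these assumptions, a project with no recent commits drifts toward a raw score of zero, which is consistent with an activity of 0.0 for a repository last committed to about two years ago.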
streamify
- Where can I find online projects end-to-end?
- Completed my first Data Engineering project with Kafka, Spark, GCP, Airflow, dbt, Terraform, Docker and more!
finnhub-streaming-data-pipeline
- Reddit Sentiment Analysis Real-Time* Data Pipeline
  I didn't use any specific guide. It was mostly build, test, integrate, and repeat for each component. For some of them, I went through the official getting-started documentation for each application and implemented it in the cluster. However, I reckon you can find other tutorials to set up each application by itself. A few GitHub projects helped me plan the project architecture and codebase structure, such as https://github.com/RSKriegs/finnhub-streaming-data-pipeline and https://gitlab.fit.cvut.cz/kozlovit/ni-dip-project-kozlovit.
- Where can I find online projects end-to-end?
What are some alternatives?
eventsim - Event data simulator. Generates a stream of pseudo-random events from a set of users, designed to simulate web traffic.
Reddit-API-Pipeline
terraform - Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
surf_dash
Airflow - Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
reddit-streaming-pipeline - A real-time reddit data streaming pipeline for sentiment analysis of various subreddits
data-engineering-zoomcamp - Free Data Engineering course!
tfl-bikes-data-pipeline - Processing TFL data for bike usage with Google Cloud Platform.
audiophile-e2e-pipeline - Pipeline that extracts data from Crinacle's Headphone and InEarMonitor databases and finalizes data for a Metabase Dashboard.
spark-bigquery-connector - BigQuery data source for Apache Spark: Read data from BigQuery into DataFrames, write DataFrames into BigQuery tables.