- pyspark-on-aws-emr VS docker-livy
- pyspark-on-aws-emr VS TypedPyspark
- pyspark-on-aws-emr VS demo-code
- pyspark-on-aws-emr VS Traffic-Data-Analysis-with-Apache-Spark-Based-on-Mobile-Robot-Data
- pyspark-on-aws-emr VS livyc
- pyspark-on-aws-emr VS IDEA
- pyspark-on-aws-emr VS pyspark-starter
- pyspark-on-aws-emr VS Dropout-Students-Prediction
- pyspark-on-aws-emr VS recommendation-system
- pyspark-on-aws-emr VS lithops
Pyspark-on-aws-emr Alternatives
Similar projects and alternatives to pyspark-on-aws-emr
-
Traffic-Data-Analysis-with-Apache-Spark-Based-on-Mobile-Robot-Data
Mobile robot data were analyzed with Apache Spark to produce five statistical results: travel time, waiting time, average speed, occupancy, and density.
-
Dropout-Students-Prediction
The goal of this project is to identify students at risk of dropping out of school.
-
uber-expenses-tracking
The goal of this project is to track the expenses of Uber Rides and Uber Eats through data engineering processes, using technologies such as Apache Airflow, AWS Redshift, and Power BI.
-
wbz
A parallel implementation of the bzip2 data compressor in Python. The compression pipeline uses algorithms such as the Burrows–Wheeler transform (BWT) and move-to-front (MTF) to improve Huffman compression. For now, the tool focuses only on compressing .csv files and other files in tabular format.
-
data-engineering-challenge-th
Dockerizing a Python script for web scraping and consuming the scraped data using FastAPI (www.metroscubicos.com)
-
text-analysis-speeches-amlo
Text analysis of the speeches, conferences and interviews of the current president of Mexico
-
distance-metrics
Distance metrics are one of the most important parts of some machine learning algorithms, both supervised and unsupervised. They help calculate and measure similarities between numerical values expressed as data points.
-
lithops
A multi-cloud framework for big data analytics and embarrassingly parallel jobs that provides a universal API for building parallel applications in the cloud ☁️🚀
pyspark-on-aws-emr reviews and mentions
-
Data Engineering Projects for Beginners
Building Big Data Pipelines in the Cloud with AWS EMR
Stats
Wittline/pyspark-on-aws-emr is an open source project licensed under the Apache License 2.0, which is an OSI-approved license.
The primary programming language of pyspark-on-aws-emr is Python.