Big Data Processing, EMR with Spark and Hadoop | Python, PySpark

Apache Spark

101 38,378 10.0 Scala

Apache Spark - A unified analytics engine for large-scale data processing

Apache Spark is an open-source, distributed processing system used for big data workloads. Want to dig deeper?

Apache Hadoop

26 14,316 9.9 Java

Apache Hadoop is an open-source framework used to efficiently store and process large datasets ranging in size from gigabytes to petabytes. Want to dig deeper?

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
