Scala MapReduce

Open-source Scala projects categorized as MapReduce

Scala MapReduce Projects

  1. Apache Spark

    Apache Spark - A unified analytics engine for large-scale data processing

    Project mention: Every Database Will Support Iceberg — Here's Why | dev.to | 2025-04-22

    Apache Iceberg defines a table format that separates how data is stored from how data is queried. Any engine that implements the Iceberg integration — Spark, Flink, Trino, DuckDB, Snowflake, RisingWave — can read and/or write Iceberg data directly.

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Scala MapReduce related posts

  • How to Reduce Big Data Analytics Costs by 90% with Karpenter and Spark

    3 projects | dev.to | 21 Apr 2025
  • Unveiling the Apache License 2.0: A Deep Dive into Open Source Freedom

    3 projects | dev.to | 11 Mar 2025
  • The Application of Java Programming In Data Analysis and Artificial Intelligence

    1 project | dev.to | 10 Mar 2025
  • Apache Spark: Revolutionizing Big Data with Sustainable Open Source Funding

    1 project | dev.to | 6 Mar 2025
  • Run PySpark Local Python Windows Notebook

    2 projects | dev.to | 21 Jan 2025
  • Infrastructure for data analysis with Jupyter, Cassandra, PySpark, and Docker

    2 projects | dev.to | 15 Jan 2025
  • His Startup Is Now Worth $62B. It Gave Away Its First Product Free

    1 project | news.ycombinator.com | 17 Dec 2024

Index

# Project Stars
1 Apache Spark 41,117

Did you know that Scala is the 32nd most popular programming language based on number of references?