Scala Python

Open-source Scala projects categorized as Python

Top 11 Scala Python Projects

  1. Apache Spark

    Apache Spark - A unified analytics engine for large-scale data processing

    Project mention: Gravitino - the unified metadata lake | dev.to | 2025-08-11

    In the meantime, other query engine support is on the roadmap, including Apache Spark, Apache Flink, and others.

  2. Mill

    Mill is a build tool for Java, Scala and Kotlin: 3-6x faster than Maven or Gradle, less fiddling with plugins, and more easily explorable in your IDE

    Project mention: Ask HN: What are you working on? (May 2025) | news.ycombinator.com | 2025-05-25

    Working on my Mill build tool, aiming to bring a modern developer experience to the JVM ecosystem:

    - https://mill-build.org

    Build tools are generally an un-sexy field, and JVM build tools perhaps doubly so. But Mill demonstrates that with some thought put into the design and architecture, we can speed up JVM development workflows by 3-6x over traditional JVM tools like Maven or Gradle, and make it subjectively much easier to navigate in IDEs and extend with custom logic.

    If you're passionate about developer experience and work on the JVM, I encourage you to give Mill a try!

  3. mleap

    MLeap: Deploy ML Pipelines to Production

  4. Cortex

    Cortex: a Powerful Observable Analysis and Active Response Engine (by TheHive-Project)

  5. adam

    ADAM is a genomics analysis platform with specialized file formats built using Apache Avro, Apache Spark, and Apache Parquet. Apache 2 licensed.

  6. sparkMeasure

    This is the development repository for sparkMeasure, a tool and library designed for efficient analysis and troubleshooting of Apache Spark jobs. It focuses on easing the collection and examination of Spark metrics, making it a practical choice for both developers and data engineers.

    Project mention: SparkMeasure is a tool for performance troubleshooting of Apache Spark jobs | news.ycombinator.com | 2025-05-08

  7. scalapy

    Use the world of Python from the comfort of Scala!

  8. Vyxal

    A code-golfing language experience that has aspects of traditional programming languages - terse yet convenient.

  9. spark-extension

    A library that provides useful extensions to Apache Spark and PySpark.

  10. kukulcan

    A REPL for Apache Kafka

  11. stasis

    Backup and recovery system with emphasis on security and privacy (by sndnv)

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Scala Python related posts

  • SparkMeasure is a tool for performance troubleshooting of Apache Spark jobs

    1 project | news.ycombinator.com | 8 May 2025
  • How to Reduce Big Data Analytics Costs by 90% with Karpenter and Spark

    3 projects | dev.to | 21 Apr 2025
  • Apache Spark VS cocoindex - a user suggested alternative

    2 projects | 1 Apr 2025
  • The Application of Java Programming In Data Analysis and Artificial Intelligence

    1 project | dev.to | 10 Mar 2025
  • Apache Spark: Revolutionizing Big Data with Sustainable Open Source Funding

    1 project | dev.to | 6 Mar 2025
  • Run PySpark Local Python Windows Notebook

    2 projects | dev.to | 21 Jan 2025
  • Infrastructure for data analysis with Jupyter, Cassandra, PySpark, and Docker

    2 projects | dev.to | 15 Jan 2025

Index

What are some of the best open-source Python projects in Scala? This list will help you:

#   Project          Stars
1   Apache Spark     41,789
2   Mill              2,565
3   mleap             1,521
4   Cortex            1,464
5   adam              1,033
6   sparkMeasure        784
7   scalapy             570
8   Vyxal               290
9   spark-extension     229
10  kukulcan            116
11  stasis              108
