Jupyter Notebook Spark

Open-source Jupyter Notebook projects categorized as Spark

Top 6 Jupyter Notebook Spark Projects

  • H2O

    H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.

    Project mention: [PAID] Looking for Phaser.js game developer | reddit.com/r/INAT | 2021-12-09

    Built and founded various web3 projects over the last 2 years, such as OpenArt and 8RealmDojo, while also being a high-performing student at CTU in Prague and SeoulTech. Was offered internships at Amazon and H2O.ai. Created robot assistants using robots from SoftBank.

  • BigDL

    Building Large-Scale AI Applications for Distributed Big Data

    Project mention: Machine learning on JVM | reddit.com/r/scala | 2021-04-05

    Intel BigDL for Spark which again is for Spark.

  • HELK

    The Hunting ELK

    Project mention: Home lab with security monitoring tools? | reddit.com/r/netsecstudents | 2021-09-23

    HELK can help for the SIEM and detection part

  • JustEnoughScalaForSpark

    A tutorial on the most important features and idioms of Scala that you need to use Spark's Scala APIs.

    Project mention: Learning Spark Scala: I'm a medium Python Data Engineer with some experience in Java. I have to learn "enough" Scala to be at ease with Spark's Scala API. I have three weeks. Where should I start ? | reddit.com/r/scala | 2021-02-03

    There's literally something called, "Just enough Scala for Spark." https://github.com/deanwampler/JustEnoughScalaForSpark

  • project-atlas-sao-paulo

    A project for the development of rich geospatial data from the city of São Paulo for use in Machine Learning models.

    Project mention: Project Atlas - São Paulo - A project for the development of rich geospatial data from the city of São Paulo for use in Machine Learning models (really great for learning Pyspark) | reddit.com/r/gis | 2022-01-11

  • synapse-azure-data-explorer-101

    Getting started with Azure Synapse and Azure Data Explorer

    Project mention: Getting started with Azure Data Explorer and Azure Synapse Analytics for Big Data processing | dev.to | 2021-07-16

    Notebooks are available in this GitHub repo — https://github.com/abhirockzz/synapse-azure-data-explorer-101

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2022-01-11.

What are some of the best open-source Spark projects in Jupyter Notebook? This list will help you:

#  Project                          Stars
1  H2O                              5,691
2  BigDL                            3,824
3  HELK                             3,133
4  JustEnoughScalaForSpark            616
5  project-atlas-sao-paulo              2
6  synapse-azure-data-explorer-101      1