Top 11 Jupyter Notebook Spark Projects
-
Project mention: Magic: The Gathering dashboard | First complete DE project ever | Feedback welcome | reddit.com/r/dataengineering | 2023-03-23
I am fairly new to DE: I have been learning Python since December 2022 and come from a non-tech background. I took part in the DataTalksClub Zoomcamp and started using the tools in this project in January 2023.
-
H2O
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
H2O.ai
-
-
Project mention: Kali Linux 2023.1 introduces 'Purple' distro for defensive security | reddit.com/r/netsec | 2023-03-14
Utilizing that API and Jupyter notebooks is exactly why Hunting ELK is the way it is, from my understanding.
-
JustEnoughScalaForSpark
A tutorial on the most important features and idioms of Scala that you need to use Spark's Scala APIs.
Project mention: Which tutorial to learn functional programming without going in depth? | reddit.com/r/scala | 2023-02-09
https://github.com/deanwampler/JustEnoughScalaForSpark
-
Project mention: ✨ 5 Open Source Data Engineering Projects 🔥 | reddit.com/r/dataengineering | 2022-10-19
5️⃣ Data Engineering Projects
-
I've worked for 3-4 years in positions where I helped structure ETLs, DWs and the like. However, I'm now on the cusp of being hired to help structure the area in a big investment fund here, helping the research area have an easier time focusing on their models. My previous experience led me to grasp DBT and SQL, and most of it came from using a Microsoft stack with SSIS, Analysis Services and the like. I'm feeling way over my head to start building this, and the multitude of possible stacks makes me very afraid that I might overengineer this, and I will initially be alone in the area. What do I do? Fake it till I make it? I never lied in my resume, so it's not like they expect a senior with plenty of experience, but still... I read this: https://github.com/zsvoboda/ngods-stocks and it seems like a good starter, albeit overly complex for our use case. I could use suggestions, people to talk to, etc. Please help.
-
-
amazon-emr-with-delta-lake
Amazon EMR Notebook to show how to read from and write to Delta tables with Amazon EMR
-
project-atlas-sao-paulo
A project for the development of rich geospatial data from the city of São Paulo for use in Machine Learning models.
-
Jupyter Notebook Spark related posts
- Magic: The Gathering dashboard | First complete DE project ever | Feedback welcome
- SNS/SQS vs Kafka
- Help needed for a direction
- How can I learn to build and manage data pipelines without any experience?
- Learn more on data engineering
- Shifting Career to Data Engineering
- Help getting on track
Index
What are some of the best open-source Spark projects in Jupyter Notebook? This list will help you:
# | Project | Stars |
---|---|---|
1 | data-engineering-zoomcamp | 12,946 |
2 | H2O | 6,190 |
3 | BigDL | 4,175 |
4 | HELK | 3,429 |
5 | JustEnoughScalaForSpark | 660 |
6 | Data-Engineering-Projects | 392 |
7 | ngods-stocks | 256 |
8 | rikai | 128 |
9 | amazon-emr-with-delta-lake | 15 |
10 | project-atlas-sao-paulo | 7 |
11 | synapse-azure-data-explorer-101 | 3 |