Top 14 Jupyter Notebook PySpark Projects
-
WallStreetBets_BigDataAnalysis
A research project that aims to classify the best stock research posts from r/WallStreetBets for you.
-
pyspark-tutorial
PySpark Tutorial for Beginners - Practical Examples in Jupyter Notebook with Spark version 3.4.1. The tutorial covers various topics like Spark Introduction, Spark Installation, Spark RDD Transformations and Actions, Spark DataFrame, Spark SQL, and more. It is completely free on YouTube and is beginner-friendly without any prerequisites. (by coder2j)
-
lasagna
A Docker Compose template that builds an interactive development environment for PySpark with Jupyter Lab, MinIO as object storage, Hive Metastore, Trino, and Kafka.
-
reddit-streaming
Streaming eight subreddits from the Reddit API using a Kafka producer and Spark Structured Streaming.
-
pyspark_nlp_workshop
Instructions and code for the workshop "From Big Data to NLP Insights: Unlocking the Power of PySpark and Spark NLP"
-
project-atlas-sao-paulo
A project for the development of rich geospatial data from the city of São Paulo for use in Machine Learning models.
-
workshop-introduction-to-machine-learning
Come ready to discover the goals and approaches of machine learning, and how to build effective algorithms and solutions!
-
project
Predict how many points a European football team will end the season with, based on the characteristics of its players. Project for the Big Data Computing course at Sapienza University of Rome (2021-22) (by Big-Data-FC)
Watch it now: https://youtu.be/EB8lfdxpirM GitHub repo: https://github.com/coder2j/pyspark-tutorial
Project mention: PySpark for NLP workshop - Jupyter notebooks and instructions | news.ycombinator.com | 2023-05-14
file-format-benchmark: benchmark script of key operations between different file formats
Jupyter Notebook PySpark related posts
- [P] ESG scoring with Node2Vec and web-site with streamlit!
- PySpark: A brief analysis to the most common words in Dracula, by Bram Stoker
- I built an AI to classify good DD and bad DD, also shows the growth percentage of a stock associated with a post.
- Release John Snow Labs Spark-NLP 2.7.0: New T5 and MarianMT seq2seq transformers, detect up to 375 languages, word segmentation, over 720+ models and pipelines, support for 192+ languages, and many more! · JohnSnowLabs/spark-nlp
Index
What are some of the best open-source PySpark projects in Jupyter Notebook? This list will help you:
# | Project | Stars |
---|---|---|
1 | Gather-Deployment | 350 |
2 | WallStreetBets_BigDataAnalysis | 165 |
3 | anovos | 77 |
4 | pyspark-tutorial | 29 |
5 | lasagna | 27 |
6 | ESG-AI-investment-by-streamlit | 21 |
7 | reddit-streaming | 18 |
8 | pyspark_nlp_workshop | 12 |
9 | project-atlas-sao-paulo | 9 |
10 | workshop-introduction-to-machine-learning | 7 |
11 | project | 6 |
12 | synapse-azure-data-explorer-101 | 4 |
13 | file-format-benchmark | 2 |
14 | dracula | 0 |