Top 9 Python apache-spark Projects
Here, we’ll implement the experimentation workflow using DagsHub, Google Colab, MLflow, and Data Version Control (DVC). We’ll focus on how to do this without diving deep into the technicalities of building or designing a workbench from scratch. Building one from scratch adds complexity, especially if you are in the early stages of understanding ML workflows, working on a small project, or trying to implement a proof of concept.
Project mention: Invitation to collaborate on open source PySpark projects | reddit.com/r/apachespark | 2022-10-15
quinn is a library with PySpark helper functions. I need to work through all the open issues / PRs and bump all versions. I should do another release. This library gets around 600,000 monthly downloads.
Project mention: Useful Tools and Programs list for Apache Spark | reddit.com/r/dataengineering | 2022-03-20
Traffic-Data-Analysis-with-Apache-Spark-Based-on-Mobile-Robot-Data
Mobile robot data were analyzed with Apache Spark to produce five statistical results: travel time, waiting time, average speed, occupancy, and density.
Project mention: Traffic Data Analysis with Apache Spark Based on Autonomous Transport Vehicle Data | dev.to | 2022-04-05
You can access all projects on my GitHub repo.
Project mention: Wittline/livyc: Apache Livy Client | reddit.com/r/DataEngineeringLatam | 2022-06-30
I developed a function to do this! Check it out: https://github.com/kharigardner/Patek
Python apache-spark related posts
- AWS re:invent 2022 wish list
- Invitation to collaborate on open source PySpark projects
- [D] Who here are convinced that they have a really good setup that keeps track of their ML experiments?
- How often do you use MLflow for your computer vision models?
- Keeping Your Machine Learning Models on the Right Track: Getting Started with MLflow, Part 2
- [RFP] Product idea for BYOD data science platform
- mlflow: Open source platform for the machine learning lifecycle
Index
What are some of the best open-source apache-spark projects in Python? This list will help you:
# | Project | Stars |
---|---|---|
1 | MLflow | 13,535 |
2 | flintrock | 616 |
3 | quinn | 392 |
4 | PySpark-Boilerplate | 382 |
5 | sparktorch | 286 |
6 | Apache-Spark-Guide | 14 |
7 | Traffic-Data-Analysis-with-Apache-Spark-Based-on-Mobile-Robot-Data | 9 |
8 | livyc | 2 |
9 | Patek | 0 |