popmon
koalas
| | popmon | koalas |
|---|---|---|
| Mentions | 1 | 2 |
| Stars | 486 | 3,319 |
| Growth | 0.6% | 0.2% |
| Activity | 6.9 | 4.6 |
| Latest commit | 3 months ago | about 1 month ago |
| Language | Python | Python |
| License | MIT License | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
koalas
- My new company uses Pyspark. I want to learn it before my starting date. Any advice?
  If they're using Databricks and you're familiar with pandas, Koalas should be right up your alley.
- Spark vs Pandas
  If you like excessive use of square brackets... I mean pandas, you might wanna check out Koalas. Koalas is supposed to provide a pandas DataFrame API implementation on top of Spark.
What are some alternatives?
sweetviz - Visualize and compare datasets, target values and associations, with one line of code.
Dask - Parallel computing with task scheduling
cape-dataframes - Privacy transformations on Spark and Pandas dataframes backed by a simple policy language.
datacompy - Pandas and Spark DataFrame comparison for humans and more!
ydata-profiling - 1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.
PandasGUI - A GUI for Pandas DataFrames
lifetimes - Lifetime value in Python
fastdbfs - An interactive command line client for Databricks DBFS.
Optimus - Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
data-science-ipython-notebooks - Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.