fastdbfs
koalas
Metric | fastdbfs | koalas
---|---|---
Mentions | 1 | 2
Stars | 4 | 3,319
Growth | - | 0.2%
Activity | 0.0 | 4.6
Last commit | almost 3 years ago | about 1 month ago
Language | Python | Python
License | GNU General Public License v3.0 only | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
fastdbfs
-
fastdbfs - An interactive command line client for Databricks DBFS
fastdbfs is an interactive command line client for accessing Databricks DBFS. It aims to be friendlier and faster than the official CLI tool, and feature-rich as well.
koalas
-
My new company uses PySpark. I want to learn it before my starting date. Any advice?
If they're using Databricks and you're familiar with pandas, Koalas should be right up your alley.
-
Spark vs Pandas
If you like excessive use of square brackets... I mean pandas, you might wanna check out Koalas. Koalas is supposed to provide a pandas DataFrame API implementation on top of Spark.
What are some alternatives?
dbx - 🧱 Databricks CLI eXtensions - aka dbx is a CLI tool for development and advanced Databricks workflows management.
Dask - Parallel computing with task scheduling
nutter - Testing framework for Databricks notebooks
datacompy - Pandas and Spark DataFrame comparison for humans and more!
Redash - Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.
PandasGUI - A GUI for Pandas DataFrames
popmon - Monitor the stability of a Pandas or Spark dataframe ⚙︎
cape-dataframes - Privacy transformations on Spark and Pandas dataframes backed by a simple policy language.
data-science-ipython-notebooks - Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.