dask-gateway
spark-snowflake
Metric | dask-gateway | spark-snowflake
---|---|---
Mentions | 4 | 1
Stars | 127 | 196
Growth | 0.8% | -0.5%
Activity | 8.4 | 5.6
Last commit | 8 days ago | 2 months ago
Language | Python | Scala
License | BSD 3-clause "New" or "Revised" License | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
dask-gateway
- How to change the API version from v1alpha to v1 prior to upgrading the kubernetes cluster?
- How can we change the API versions of kubernetes objects in GKE prior to cluster upgrade?
Those two resource types are using the traefik.containo.us/v1alpha1 API version, which itself is defined at https://github.com/dask/dask-gateway/blob/main/resources/helm/dask-gateway/crds/traefik.yaml, and doesn't use the deprecated CRD API.
- Why Databricks Is Winning
I’ve had a lot of success with Dask lately. It’s comparable to Spark in some ways [0]. Because it is written in Python and built on top of pandas/NumPy, it allows much more flexibility. It also has great tools built on top of Kubernetes that make deployment quick and easy [1].
[0] https://docs.dask.org/en/latest/spark.html
[1] https://github.com/dask/dask-gateway
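The GKE answer above separates two things: the version of a custom API (traefik.containo.us/v1alpha1) and the version of the CRD API used to declare it. A minimal sketch of that distinction follows, assuming manifests are already loaded as Python dicts. The helper `find_deprecated`, the sample manifests, and the `IngressRoute` kind are illustrative assumptions, not taken from the dask-gateway chart.

```python
# Hypothetical helper: flag Kubernetes manifests that use a deprecated API version.
# apiextensions.k8s.io/v1beta1 is the deprecated CRD API (removed in Kubernetes 1.22).
DEPRECATED_API_VERSIONS = {"apiextensions.k8s.io/v1beta1"}


def find_deprecated(manifests):
    """Return (kind, name, apiVersion) for each manifest using a deprecated API version."""
    flagged = []
    for manifest in manifests:
        api_version = manifest.get("apiVersion", "")
        if api_version in DEPRECATED_API_VERSIONS:
            name = manifest.get("metadata", {}).get("name", "<unnamed>")
            flagged.append((manifest.get("kind", "<unknown>"), name, api_version))
    return flagged


# Made-up sample manifests, for illustration only.
manifests = [
    # A custom resource: "v1alpha1" is the version of the custom API,
    # not of the CRD API, so it is not flagged.
    {"apiVersion": "traefik.containo.us/v1alpha1", "kind": "IngressRoute",
     "metadata": {"name": "example-route"}},
    # A CRD declared with the current CRD API.
    {"apiVersion": "apiextensions.k8s.io/v1", "kind": "CustomResourceDefinition",
     "metadata": {"name": "ingressroutes.traefik.containo.us"}},
    # A CRD declared with the deprecated CRD API.
    {"apiVersion": "apiextensions.k8s.io/v1beta1", "kind": "CustomResourceDefinition",
     "metadata": {"name": "legacy.example.com"}},
]

flagged = find_deprecated(manifests)
for kind, name, api_version in flagged:
    print(f"{kind}/{name} uses deprecated {api_version}")
```

Only the third manifest is reported. The v1alpha1 custom resource passes because its group is not the CRD API group.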
spark-snowflake
- Why Databricks Is Winning
Snowflake and Databricks are different, sometimes complementary technologies. For example, you can store data in Snowflake and query it with Databricks: https://github.com/snowflakedb/spark-snowflake
Snowflake predicate pushdown filtering seems quite promising: https://www.snowflake.com/blog/snowflake-spark-part-2-pushin...
I think both of these companies can win.
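The "store in Snowflake, query with Spark" pattern from the comment above might look roughly like the sketch below. It assumes a PySpark `SparkSession` with the spark-snowflake connector on the classpath. The helper names `snowflake_options` and `read_filtered`, the table name, and all connection values are hypothetical. Whether a given filter is actually pushed down to Snowflake depends on the connector version and configuration.

```python
# Data source name registered by the spark-snowflake connector.
SNOWFLAKE_SOURCE = "net.snowflake.spark.snowflake"


def snowflake_options(url, user, password, database, schema, warehouse):
    """Build the connector's connection options (hypothetical helper)."""
    return {
        "sfURL": url,
        "sfUser": user,
        "sfPassword": password,
        "sfDatabase": database,
        "sfSchema": schema,
        "sfWarehouse": warehouse,
    }


def read_filtered(spark, options, table, condition):
    """Read a Snowflake table as a DataFrame and apply a filter.

    `spark` is expected to be a SparkSession. The filter is expressed on the
    DataFrame; the connector may translate it into SQL that runs in Snowflake.
    """
    return (
        spark.read.format(SNOWFLAKE_SOURCE)
        .options(**options)
        .option("dbtable", table)
        .load()
        .filter(condition)
    )
```

With a real session this would be called as `read_filtered(spark, opts, "ORDERS", "AMOUNT > 100")`, where `ORDERS` and `AMOUNT` are placeholder names.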
What are some alternatives?
flintrock - A command-line tool for launching Apache Spark clusters.
databricks-nutter-repos-demo - Demo of using the Nutter for testing of Databricks notebooks in the CI/CD pipeline
kube-no-trouble - Easily check your clusters for use of deprecated APIs
transformers - 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
Apache Spark - A unified analytics engine for large-scale data processing
chispa - PySpark test helper methods with beautiful error messages
delta - An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs