data_origination_workshop
awesome-spark
Metric | data_origination_workshop | awesome-spark
---|---|---
Mentions | 1 | 1
Stars | 11 | 1,617
Growth | - | 0.9%
Activity | 6.3 | 1.0
Last commit | about 2 months ago | 25 days ago
Language | Shell | Shell
License | - | Creative Commons Zero v1.0 Universal
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
What are your favorite Apache Spark open source libraries?
The awesome-spark repo has a list of Spark OSS libraries, but a lot of them are quite old.
What are some alternatives?
csv-import - The open-source CSV importer, maintained by @tableflowhq
papers-we-love - Papers from the computer science community to read and discuss.
DeepStream-dGPU-Installation - This repository is helpful for installing the DeepStream SDK and its Python bindings on a dGPU machine.
docker-spark - Apache Spark docker image
quix-streams - A Python library for building containerized ML and Generative AI applications with Apache Kafka.
talksheet - A GPT powered CLI tool that answers questions about your data
qr-code - A no-framework, no-dependencies, customizable, animate-able, SVG-based <qr-code> HTML element.
flasho - Open source customer notifications in less than 5 minutes
unilm - Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
memphis-cli-old - Memphis is an event processing platform
openai-python - The official Python library for the OpenAI API
wasmer - 🚀 The leading Wasm Runtime supporting WASIX, WASI and Emscripten