Metric | clojask | geni
---|---|---
Mentions | 5 | 4
Stars | 113 | 278
Growth | - | 1.8%
Activity | 4.2 | 5.6
Last commit | 9 months ago | 6 months ago
Language | Clojure | Clojure
License | MIT License | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
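The exact formulas behind these numbers are not given here, so the following is only a hypothetical sketch of how such metrics could be computed. It assumes that each commit's weight halves every 30 days, that the activity score is ten times the fraction of tracked projects with a lower weighted score, and that growth is the percentage change in stars over one month. All function names and the `half_life_days` parameter are illustrative, not taken from the site.

```python
from bisect import bisect_left


def weighted_commit_score(commit_ages_days, half_life_days=30.0):
    """Sum per-commit weights; a commit's weight halves every half_life_days (assumed decay)."""
    return sum(0.5 ** (age / half_life_days) for age in commit_ages_days)


def activity(score, all_scores):
    """Map a raw score to a 0-10 scale by the fraction of tracked projects scoring lower."""
    ranked = sorted(all_scores)
    below = bisect_left(ranked, score)  # number of projects with a strictly lower score
    return round(10.0 * below / len(ranked), 1)


def monthly_growth(stars_now, stars_month_ago):
    """Month-over-month star growth, as a percentage."""
    return round(100.0 * (stars_now - stars_month_ago) / stars_month_ago, 1)


# Hypothetical data: 100 tracked projects with raw scores 0..99.
tracked = list(range(100))
print(activity(90, tracked))            # 90 of 100 projects score lower
print(weighted_commit_score([0, 30, 60]))
print(monthly_growth(278, 273))         # 273 is a made-up prior star count
```

Under these assumptions, a project that outscores 90% of the tracked projects gets an activity of 9.0, which matches the "top 10%" reading above.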
clojask
- Data-recur meeting 2: general monthly - updates about Clojask, ds4clj, and more
  Among other things, we will have a brief intro to Clojask - a library for parallel computing of larger-than-memory datasets developed at HKU Business School.
- Question about data engineer in clojure
  You can give Clojask a try, it's designed for larger-than-memory datasets. https://github.com/clojure-finance/clojask
- A data science course for Clojurians – are you interested?
  You could give Clojask a try. If you need to read from different file types other than .csv, you can also use the Clojask "plug-in" called clojask-io
- Clojask – data processing with parallel computing on larger-than-memory datasets
- Clojask: A parallel data processing framework that is designed for large datasets
  Clojask is a data processing framework that is designed for large datasets, inspired by Dask, Spark and NoSQL databases.
geni
- Spark Anyone?
  sparkling is fine. there is also geni
- LLVM!
- Scala is a Maintenance Nightmare
  I haven't tried Spark from Kotlin, but it's a nice experience working with it in Clojure, and I have yet to see a language more expressive than Clojure. :)
- Data engineering and Clojure?
  I think for the large scale stuff, wrappers like geni are pretty nice and built on top of established tech. There were several distributed computing platforms like onyx and storm that popped up in clojure as well that may be interesting to look at. clojure toolbox has a good index of libraries to examine.
What are some alternatives?
cascalog - Data processing on Hadoop without the hassle.
tech.ml.dataset - A Clojure high performance data processing system
jackdaw - A Clojure library for the Apache Kafka distributed streaming platform.
tablecloth - Dataset manipulation library built on the top of tech.ml.dataset
holy-lambda - The extraordinary simple, performant, and extensible custom AWS Lambda runtime for Clojure.
clojask-io - Reading and writing various file formats for Clojask: clojask-io is a library designed to extend the file support for Clojask. This library can also be used alone to read in and output dataset files.
kotlinx.collections.immutable - Immutable persistent collections for Kotlin
notespace - using your namespace as a notebook
frovedis - Framework of vectorized and distributed data analytics
deep-diamond - A fast Clojure Tensor & Deep Learning library
geni-performance-benchmark