Top 3 Nim OpenMP Projects
- Arraymancer: A fast, ergonomic and portable tensor library in Nim with a deep learning focus, targeting CPU, GPU and embedded devices via OpenMP, CUDA and OpenCL backends
- weave: A state-of-the-art multithreading runtime, message-passing based, fast, scalable, ultra-low overhead (by mratsim)
- laser: The HPC toolbox: fused matrix multiplication, convolution, data-parallel strided tensor primitives, OpenMP facilities, SIMD, JIT assembler, CPU detection, state-of-the-art vectorized BLAS for floats and integers (by mratsim)
It is a small DSL written using macros at https://github.com/mratsim/Arraymancer/blob/master/src/array....
Nim has pretty great metaprogramming capabilities, and Arraymancer employs some cool features, like emitting CUDA kernels on the fly from standard templates depending on the backend!
Project mention: The GIL can now be disabled in Python's main branch | news.ycombinator.com | 2024-03-11
It depends.
You need 2-3 accumulators to saturate instruction-level parallelism in a parallel sum reduction. But the compiler won't generate them on its own: it only does so when the operation is associative, i.e. (a+b)+c = a+(b+c), which holds for integers but not for floats.
The escape hatch is -ffast-math.
I have extensive benches on this here: https://github.com/mratsim/laser/blob/master/benchmarks%2Ffp...
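To make the accumulator point concrete, here is a minimal C sketch (my own illustration, not code from the benchmarks linked above): the naive loop forms one serial dependency chain of float additions, while a manually unrolled version with two accumulators exposes two independent chains. Because the unrolling reorders float additions, a compiler will only perform this transformation itself under `-ffast-math` (or `-fassociative-math`).

```c
#include <stddef.h>

/* Naive sum: a single serial dependency chain, one add completes
 * before the next can start. */
float sum_naive(const float *x, size_t n) {
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc += x[i];
    return acc;
}

/* Two accumulators: the two chains have no dependency on each other,
 * so their adds can execute in parallel on separate execution ports.
 * This reorders the float additions, which changes rounding, hence
 * the compiler won't do it for floats without -ffast-math. */
float sum_2acc(const float *x, size_t n) {
    float acc0 = 0.0f, acc1 = 0.0f;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        acc0 += x[i];
        acc1 += x[i + 1];
    }
    if (i < n)            /* handle a leftover element when n is odd */
        acc0 += x[i];
    return acc0 + acc1;   /* the final combine is where reassociation shows */
}
```

Both functions compute the same value for exactly representable inputs; on large arrays of arbitrary floats the results can differ slightly in the last bits, which is exactly why the transformation is gated behind `-ffast-math`.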
Index
What are some of the best open-source OpenMP projects in Nim? This list will help you:
| | Project | Stars |
|---|---|---|
| 1 | Arraymancer | 1,309 |
| 2 | weave | 524 |
| 3 | laser | 261 |