| | LoopVectorization.jl | PackageCompiler.jl |
|---|---|---|
| Mentions | 10 | 26 |
| Stars | 722 | 1,371 |
| Growth | 0.6% | 0.5% |
| Activity | 7.0 | 7.8 |
| Latest commit | 5 days ago | 6 days ago |
| Language | Julia | Julia |
| License | MIT License | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
LoopVectorization.jl
-
Mojo – a new programming language for all AI developers
It is a little disappointing that they're setting the bar against vanilla Python in their comparisons. While I'm sure they have put massive engineering effort into their ML compiler, the matmul demos they showed are not that impressive in an absolute sense when set against the analogous Julia code, which can use [LoopVectorization.jl](https://github.com/JuliaSIMD/LoopVectorization.jl) to automatically choose good defaults for vectorization, etc...
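For context, the kind of analogous Julia kernel being alluded to looks roughly like this (a sketch modeled on LoopVectorization's documented `@turbo` matmul example; the function name is mine):
```
using LoopVectorization

# A naive triple loop; @turbo chooses the unrolling and SIMD strategy
# for the host CPU instead of leaving it to LLVM's auto-vectorizer.
function mygemm!(C, A, B)
    @turbo for m in axes(A, 1), n in axes(B, 2)
        Cmn = zero(eltype(C))
        for k in axes(A, 2)
            Cmn += A[m, k] * B[k, n]
        end
        C[m, n] = Cmn
    end
    return C
end
```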
-
Python 3.11 is 25% faster than 3.10 on average
> My mistake in retrospect was using small arrays as part of a struct, which being immutable got replaced at each time step with a new struct requiring new arrays to be allocated and initialized. I would not have done that in c++, but julia puts my brain in matlab mode.
I see. Yes, it's an interesting design space where Julia makes both heap and stack allocations easy enough that sometimes you just reach for the heap, as in MATLAB mode. Hopefully Prem and Shuhei's work lands soon enough to stack-allocate small non-escaping arrays, so that users don't need to think about this.
> Alignment I'd assumed, but padding the struct instead of the tuple did nothing, so probably extra work to clear a piece of an simd load. Any insight on why avx availability didn't help would be appreciated. I did verify some avx instructions were in the asm it generated, so it knew, it just didn't use.
The major differences at this point seem to come down to GCC (g++) vs. LLVM, and to proofs of aliasing. LLVM's auto-vectorizer isn't that great, and LLVM seems less reliable at proving that two arrays don't alias. For the first part, some people have simply improved the loop analysis code from the Julia side (https://github.com/JuliaSIMD/LoopVectorization.jl); forcing SIMD onto LLVM can help it make the right choices. But for the second part, you do need `@simd ivdep for ...` (or LoopVectorization.jl) to match some C++ examples.

This is hopefully one of the things that JET.jl and other new analysis passes can help with, along with the new effects system (see https://github.com/JuliaLang/julia/pull/43852; this is a pretty huge new compiler feature in v1.8, but right now it's manually specified, and it will take time before things like https://github.com/JuliaLang/julia/pull/44822 land and start to make it more pervasive). When that's all together, LLVM will have more ammo for proving things more effectively (pun intended).
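To illustrate the aliasing point, here's a sketch (function names are mine) of the two ways to sidestep LLVM's aliasing proofs:
```
using LoopVectorization  # provides @turbo

# Plain loop: LLVM must prove out, x, and y don't alias before vectorizing.
function axpy!(out, x, y, a)
    @inbounds for i in eachindex(out)
        out[i] = a * x[i] + y[i]
    end
    return out
end

# `ivdep` asserts the iterations are independent (no aliasing),
# so LLVM can vectorize without proving it.
function axpy_ivdep!(out, x, y, a)
    @inbounds @simd ivdep for i in eachindex(out)
        out[i] = a * x[i] + y[i]
    end
    return out
end

# LoopVectorization makes the same no-aliasing assumption and also
# picks the unrolling and vector width itself.
function axpy_turbo!(out, x, y, a)
    @turbo for i in eachindex(out)
        out[i] = a * x[i] + y[i]
    end
    return out
end
```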
-
Vectorize function calls
This looks nice too. It seems to be maintained, and it even has a `vmap` function. What more can one ask for? ;) https://github.com/JuliaSIMD/LoopVectorization.jl
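For the curious, a minimal sketch of that `vmap`:
```
using LoopVectorization

x = rand(Float32, 1024)

# vmap is a SIMD-vectorized drop-in for map over arrays:
y = vmap(v -> v * v + one(v), x)
y ≈ map(v -> v * v + one(v), x)  # true
```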
-
Implementing dedispersion in Julia.
Have you checked out https://github.com/JuliaSIMD/LoopVectorization.jl? It may be useful for your specific use case.
-
We Use Julia, 10 Years Later
And the "how" behind Octavian.jl is basically LoopVectorization.jl [1], which helps make optimal use of your CPU's SIMD instructions.
Currently there can be some nontrivial compilation latency with this approach, but since LV ultimately emits custom LLVM, it's actually perfectly compatible with StaticCompiler.jl [2] following Mason's rewrite, so stay tuned on that front. (A short sketch of Octavian's entry point follows the links below.)
[1] https://github.com/JuliaSIMD/LoopVectorization.jl
[2] https://github.com/tshort/StaticCompiler.jl
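To make that concrete, a minimal sketch of calling Octavian directly (assuming a recent release; `matmul` is its exported entry point):
```
using Octavian

A = rand(256, 128)
B = rand(128, 64)

# Pure-Julia, multithreaded matmul with LoopVectorization-powered kernels:
C = Octavian.matmul(A, B)
C ≈ A * B  # true; agrees with the default BLAS-backed *
```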
-
Why Lisp? (2015)
Yes, and sorry if I also came off as combative here, it was not my intention either. I've used some Common Lisp before I got into Julia (though I never got super proficient with it) and I think it's an excellent language and it's too bad it doesn't get more attention.
I just wanted to share what I think is cool about julia from a metaprogramming point of view, which I think is actually its greatest strength.
> here is a hypothetical question that can be asked: would a julia programmer be more powerful if llvm was written in julia? i think the answer is clear that they would be
Sure, I'd agree it'd be great if LLVM was written in julia. However, I also don't think it's a very high priority because there are all sorts of ways to basically slap LLVM's hands out of the way and say "no, I'll just do this part myself."
E.g. consider LoopVectorization.jl [1], which does some very advanced program transformations that would normally be done at the LLVM (or lower) level. This package is written in pure Julia and is all about bypassing LLVM's pipelines and creating hyper-efficient microkernels that are competitive with the handwritten assembly in BLAS libraries.
To your point, yes Chris' life likely would have been easier here if LLVM was written in julia, but also he managed to create this with a lot less man-power in a lot less time than anything like it that I know of, and it's screaming fast so I don't think it was such a huge impediment for him that LLVM wasn't implemented in julia.
[1] https://github.com/JuliaSIMD/LoopVectorization.jl
-
A Project of One’s Own
He still holds a few land speed records he set with motorcycles he designed and built.
But I had no real hobbies or passions of my own, other than playing card games.
It wasn't until my twenties, after I already graduated college with degrees I wasn't interested in and my dad's health failed, that I first tried programming. A decade earlier, my dad was attending the local Linux meetings when away from his machine shop.
Programming, and especially performance optimization and loop vectorization, is now my passion and consumes most of my free time (https://github.com/JuliaSIMD/LoopVectorization.jl).
Hearing all the stories about people starting and getting hooked when they were 11 makes me feel like I lost a dozen years of my life. I had every opportunity, but just didn't take them. If I had children, I would worry for them.
-
When porting numpy code to Julia, is it worth it to keep the code vectorized?
Julia will often do SIMD under the hood with either a for loop or a broadcasted version, so you generally shouldn't have to worry about it. But for more advanced cases you can look at https://github.com/JuliaSIMD/LoopVectorization.jl
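For instance, both of the styles below typically compile down to SIMD code with no annotations at all (a sketch; LoopVectorization's `@turbo` is only needed for the harder cases):
```
x = rand(10^6)

# numpy-style broadcasting:
y = 2 .* x .^ 2 .+ 1

# Devectorized loop, just as fast in Julia:
function f!(y, x)
    @inbounds for i in eachindex(x, y)
        y[i] = 2 * x[i]^2 + 1
    end
    return y
end

f!(similar(x), x) ≈ y  # true
```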
-
Julia 1.6 Highlights
Very often benchmarks include Julia's compilation time, which might be slow. Sometimes they rightfully do so, but often it's really apples and oranges when benchmarking against C/C++/Rust/Fortran. Julia 1.6 shows compilation time in the `@time f()` macro, but Julia programmers typically use `@btime` from the BenchmarkTools package to get better timings (e.g. the minimum runtime over many function calls).
I think it's more interesting to see what people do with the language instead of focusing on microbenchmarks. There's for instance this great package https://github.com/JuliaSIMD/LoopVectorization.jl, which exports a simple macro `@avx` that you can stick onto loops to vectorize them in ways better than the compiler (i.e. LLVM) can. It's quite remarkable that you can implement this in the language as a package, as opposed to waiting for LLVM to improve or for the Julia compiler team to figure it out.
See the docs which kinda read like blog posts: https://juliasimd.github.io/LoopVectorization.jl/stable/
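A sketch of that workflow (note `@avx` has since been renamed `@turbo` in newer LoopVectorization releases):
```
using BenchmarkTools, LoopVectorization

function dot_turbo(x, y)
    s = zero(eltype(x))
    @turbo for i in eachindex(x)  # `@avx` in older releases
        s += x[i] * y[i]
    end
    return s
end

x, y = rand(10^5), rand(10^5)
@btime dot_turbo($x, $y)  # reports runtime only, excluding compilation
```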
PackageCompiler.jl
-
Potential of the Julia programming language for high energy physics computing
Yes, Julia can be called from other languages rather easily: Julia functions can be exposed and called with a C-like ABI [1], and there are also various packages for languages like Python [2] or R [3] to call Julia code.
With PackageCompiler.jl [4] you can even make AOT-compiled standalone binaries, though these are rather large. They've shrunk a fair amount in recent releases, but there's still a lot of low-hanging fruit for making the compiled binaries smaller, plus some manual work you can do, like removing LLVM and filtering stdlibs when they're not needed (see the sketch after the links below).
Work is also happening on a more stable/mature system that acts like StaticCompiler.jl [5], except provided by the base language and by people with more compiler experience (i.e. not a janky prototype).
[1] https://docs.julialang.org/en/v1/manual/embedding/
[2] https://pypi.org/project/juliacall/
[3] https://www.rdocumentation.org/packages/JuliaCall/
[4] https://github.com/JuliaLang/PackageCompiler.jl
[5] https://github.com/tshort/StaticCompiler.jl
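For reference, PackageCompiler's two main entry points look like this (a sketch; package and path names are hypothetical):
```
using PackageCompiler

# Bake packages into a custom system image to eliminate their JIT lag:
create_sysimage([:MyPkg];
                sysimage_path = "my_sysimage.so",
                precompile_execution_file = "precompile_script.jl")

# Or build a standalone app from a project directory; filtering unused
# stdlibs helps shrink the result:
create_app("MyAppDir", "MyAppCompiled"; filter_stdlibs = true)
```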
- Strong arrows: a new approach to gradual typing
-
Making Python 100x faster with less than 100 lines of Rust
One of Julia's Achilles heels is standalone, ahead-of-time compilation. Technically this is already possible [1], [2], but there are quite a few limitations when doing this (e.g. "Hello world" is 150 MB [7]) and it's not an easy or natural process.
The immature AoT capabilities are a huge pain to deal with when writing large code packages or even when trying to make command line applications. Things have to be recompiled each time the Julia runtime is shut down. The current strategy in the community to get around this seems to be "keep the REPL alive as long as possible" [3][4][5][6], but this isn't a viable option for all use cases.
Until Julia has better AoT compilation support, it's going to be very difficult to develop large scale programs with it. Version 1.9 has better support for caching compiled code, but I really wish there were better options for AoT compiling small, static, standalone executables and libraries.
[1]: https://julialang.github.io/PackageCompiler.jl/dev/
-
What's Julia's biggest weakness?
Doesn’t work on Windows, but https://github.com/JuliaLang/PackageCompiler.jl does.
-
I learned 7 programming languages so you don't have to
Also, you can precompile a whole package and just ship the binary. We do this all of the time.
https://github.com/JuliaLang/PackageCompiler.jl
And getting things precompiled: https://sciml.ai/news/2022/09/21/compile_time/
-
Julia performance, startup.jl, and sysimages
You can have a look at PackageCompiler.jl.
-
Why Julia 2.0 isn’t coming anytime soon (and why that is a good thing)
I think by PackageManager here you mean PackageCompiler, and yes, these improvements do not need a 2.0. v1.8 included a few things that will, in the near future, allow for building binaries without big dependencies like LLVM, and finishing this work is indeed slated for the v1.x releases. Saying "we are not doing a 2.0" is precisely saying that this is more important than things which change the user-facing language semantics.
And TTFP (time to first plot) does need to be addressed. It's a current shortcoming of the compiler that native and LLVM code is not cached during the precompilation stages. If such code were able to precompile into binaries, startup time would be dramatically decreased, because a lot of package code would no longer have to JIT compile. Tim Holy and Valentin Churavy gave a nice talk at JuliaCon 2022 about the current progress of making this work: https://www.youtube.com/watch?v=GnsONc9DYg0
This is all tied up with startup time; it's all in some sense the same issue. Currently, the only way to get LLVM code cached, and thus startup time essentially eliminated, is to build it into what's called the "system image". That system image is the binary that PackageCompiler builds (https://github.com/JuliaLang/PackageCompiler.jl).
Julia ships with a default system image that includes the standard library, in order to remove the major chunk of code that "most" libraries share; this is why all of Julia Base works without JIT lag. However, that means everyone wants their thing, be it sparse matrices or statistics, in the standard library so that it gets the JIT-lag-free build by default. The system image is therefore huge, which is why PackageCompiler, which simply builds binaries by appending package code to the system image, builds big binaries.
What needs to happen is for packages to be able to precompile in a way that caches LLVM and native code. Then there would be no major compile-time advantage to being in the system image, which would allow things to be pulled out of it for a leaner Julia Base build without major drawbacks, which would in turn help slim down the system image. An LLVM and BLAS build would then no longer have to be in every binary (which is what takes up most of the space and RAM), allowing Julia to move much more comfortably beyond the niche of scientific computing.
- Is it possible to create a Python package with Julia and publish it on PyPi?
- GenieFramework – Web Development with Julia
-
Julia for health physics/radiation detection
You're probably dancing around the edges of what [PackageCompiler.jl](https://github.com/JuliaLang/PackageCompiler.jl) is capable of targeting. There are a few new capabilities coming online, namely [separating codegen from runtime](https://github.com/JuliaLang/julia/pull/41936) and [compiling small static binaries](https://github.com/tshort/StaticCompiler.jl), but you're likely to hit some snags on the bleeding edge.
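If you do want to poke at that bleeding edge, the StaticCompiler side currently looks roughly like this (a sketch based on its README; the API is experimental and in flux):
```
using StaticCompiler

# Only a restricted, non-allocating subset of Julia is supported:
fib(n) = n <= 1 ? n : fib(n - 1) + fib(n - 2)

# Compile to a standalone shared library for the given argument types:
compile_shlib(fib, (Int,), "./")
```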
What are some alternatives?
CUDA.jl - CUDA programming in Julia.
StaticCompiler.jl - Compiles Julia code to a standalone library (experimental)
julia - The Julia Programming Language
Genie.jl - 🧞The highly productive Julia web framework
cl-cuda - Cl-cuda is a library to use NVIDIA CUDA in Common Lisp programs.
LuaJIT - Mirror of the LuaJIT git repository
julia-vim - Vim support for Julia.
Dash.jl - Dash for Julia - A Julia interface to the Dash ecosystem for creating analytic web applications in Julia. No JavaScript required.
cmu-infix - Updated infix.cl of the CMU AI repository, originally written by Mark Kantrowitz
Transformers.jl - Julia Implementation of Transformer models