Top 15 Julia Statistic Projects


Project mention: Yann Lecun: ML would have advanced if other lang had been adopted versus Python | news.ycombinator.com | 2023-02-22
If you look at Julia open source projects you'll see that the projects tend to have a lot more contributors than the Python counterparts, even over smaller time periods. A package for defining statistical distributions has had 202 contributors (https://github.com/JuliaStats/Distributions.jl), etc. Julia Base even has had over 1,300 contributors (https://github.com/JuliaLang/julia) which is quite a lot for a core language, and that's mostly because the majority of the core is in Julia itself.
This is one of the things that was noted quite a bit at this SIAM CSE conference, that Julia development tends to have a lot more code reuse than other ecosystems like Python. For example, the various machine learning libraries like Flux.jl and Lux.jl share a lot of layer intrinsics in NNlib.jl (https://github.com/FluxML/NNlib.jl), the same GPU libraries (https://github.com/JuliaGPU/CUDA.jl), the same automatic differentiation library (https://github.com/FluxML/Zygote.jl), and of course the same JIT compiler (Julia itself). These two libraries are far enough apart that people say "Flux is to PyTorch as Lux is to JAX/flax", but while in the Python world those share almost 0 code or implementation, in the Julia world they share >90% of the core internals but have different higher-level APIs.
If one hasn't participated in this space it's a bit hard to fathom how much code reuse goes on and how that is influenced by the design of multiple dispatch. This is one of the reasons there is so much cohesion in the community: it doesn't matter if one person is an ecologist and the other is a financial engineer, you may both be contributing to the same library like Distances.jl, just adding a distance function which is then used in thousands of places. With the Python ecosystem you tend to have a lot more "megapackages", PyTorch, SciPy, etc., where the barrier to entry is generally a lot higher (and sometimes requires handling the build systems, fun times). But in the Julia ecosystem you have a lot of core development happening in "small" but central libraries, like Distances.jl or Distributions.jl, which are simple enough for an undergrad to get productive in a week but are then used everywhere (Distributions.jl for example is used in every statistics package, and for definitions of prior distributions in Turing.jl's probabilistic programming language, etc.).
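
The reuse pattern described in the comment above can be illustrated with a small, self-contained Julia sketch. It is not the Distances.jl source or its API: the `Metric`, `Euclidean`, `Cityblock`, and `Chebyshev` types and the `distance` and `nearest` functions are all defined locally here, purely as assumptions for illustration. The sketch only tries to show how generic code written once against an abstract type can pick up a metric that someone else adds later.

```julia
# Hypothetical miniature of a "small but central" package.
# All names below are defined locally for illustration; nothing is imported.
abstract type Metric end

struct Euclidean <: Metric end
struct Cityblock <: Metric end

# Each metric is one small method; adding a metric means adding one method.
distance(::Euclidean, a, b) = sqrt(sum(abs2, a .- b))
distance(::Cityblock, a, b) = sum(abs, a .- b)

# Generic code written once against the abstract type. It dispatches on
# whatever concrete metric it receives, including ones defined later.
function nearest(metric::Metric, query, points)
    dists = [distance(metric, query, p) for p in points]
    return points[argmin(dists)]
end

# A later contributor adds a new metric without touching `nearest`.
struct Chebyshev <: Metric end
distance(::Chebyshev, a, b) = maximum(abs, a .- b)

points = [[0.0, 0.0], [3.0, 4.0], [1.0, 1.0]]
query = [2.0, 2.0]

println(nearest(Euclidean(), query, points))
println(nearest(Chebyshev(), query, points))
```

Because `nearest` is written against `Metric` rather than a concrete type, the `Chebyshev` method added afterwards works with it unchanged, which is roughly the kind of sharing the comment attributes to multiple dispatch.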


Project mention: An Introduction to Statistical Learning with Applications in Python | news.ycombinator.com | 2023-07-09
I actually like this book by Yoni Nazarathy
https://statisticswithjulia.org/
They have a book on Mathematics of DL too which is a natural progression from the concepts covered here.
(I am slightly biased towards this since I know the author through online interactions)




GeoStats.jl
An extensible framework for geospatial data science and geostatistical modeling fully written in Julia




ScientificTypes.jl
An API for dispatching on the "scientific" type of data instead of the machine type


Project mention: MarSwitching.jl: New package for Markov Switching regression models | /r/Julia | 2023-10-04
You can read more and check the examples in the package repo here: https://github.com/mdadej/MarSwitching.jl



biomisc_julia
Collection of miscellaneous command-line bioinformatics scripts written in Julia for speed

Julia Statistics related posts
 Downloading packages to Julia 0.7
 What is the Julia equivalent of Scikit-Learn?
 Recommended Self-Study Statistics Book
 Don't waste your time on Julia
 R user excited about Julia
 Julia ranks in the top most loved programming languages for 2022
 Julia - linear mixed models

Index
What are some of the best open-source Statistic projects in Julia? This list will help you:
| # | Project | Stars |
|---|---------|-------|
| 1 | MLJ.jl | 1,700 |
| 2 | Distributions.jl | 1,057 |
| 3 | StatsWithJuliaBook | 1,056 |
| 4 | OnlineStats.jl | 810 |
| 5 | GLM.jl | 565 |
| 6 | StatsBase.jl | 559 |
| 7 | GeoStats.jl | 478 |
| 8 | MixedModels.jl | 393 |
| 9 | HypothesisTests.jl | 285 |
| 10 | ScientificTypes.jl | 92 |
| 11 | ARCHModels.jl | 87 |
| 12 | MarSwitching.jl | 29 |
| 13 | StatsAPI.jl | 17 |
| 14 | NumericalAlgorithms.jl | 12 |
| 15 | biomisc_julia | 1 |