Julia Statistics

Open-source Julia projects categorized as Statistics

Top 15 Julia Statistics Projects

  • MLJ.jl

    A Julia machine learning framework

  • Distributions.jl

    A Julia package for probability distributions and associated functions.

    Project mention: Yann Lecun: ML would have advanced if other lang had been adopted versus Python | news.ycombinator.com | 2023-02-22

    If you look at Julia open source projects you'll see that the projects tend to have a lot more contributors than the Python counterparts, even over smaller time periods. A package for defining statistical distributions has had 202 contributors (https://github.com/JuliaStats/Distributions.jl), etc. Julia Base has even had over 1,300 contributors (https://github.com/JuliaLang/julia), which is quite a lot for a core language, and that's mostly because the majority of the core is in Julia itself.

    One thing that was noted quite a bit at this SIAM CSE conference is that Julia development tends to have a lot more code reuse than other ecosystems like Python. For example, the various machine learning libraries like Flux.jl and Lux.jl share a lot of layer intrinsics in NNlib.jl (https://github.com/FluxML/NNlib.jl), the same GPU libraries (https://github.com/JuliaGPU/CUDA.jl), the same automatic differentiation library (https://github.com/FluxML/Zygote.jl), and of course the same JIT compiler (Julia itself). These two libraries are far enough apart that people say "Flux is to PyTorch as Lux is to JAX/flax", but while in the Python world those share almost no code or implementation, in the Julia world they share >90% of the core internals but have different higher-level APIs.

    If one hasn't participated in this space it's a bit hard to fathom how much code reuse goes on and how that is influenced by the design of multiple dispatch. This is one of the reasons there is so much cohesion in the community: it doesn't matter if one person is an ecologist and the other is a financial engineer, you may both be contributing to the same library like Distances.jl, just adding a distance function which is then used in thousands of places. With the Python ecosystem you tend to have a lot more "megapackages" (PyTorch, SciPy, etc.) where the barrier to entry is generally a lot higher (and sometimes requires handling the build systems, fun times). But in the Julia ecosystem you have a lot of core development happening in "small" but central libraries, like Distances.jl or Distributions.jl, which are simple enough for an undergrad to get productive in within a week but are then used everywhere (Distributions.jl, for example, is used in every statistics package, for defining prior distributions in Turing.jl's probabilistic programming language, etc.).

  • StatsWithJuliaBook

    Project mention: An Introduction to Statistical Learning with Applications in Python | news.ycombinator.com | 2023-07-09

    I actually like this book by Yoni Nazarathy


    They have a book on Mathematics of DL too which is a natural progression from the concepts covered here.

    (I am slightly biased towards this since I've known the author through online interactions)

  • OnlineStats.jl

    ⚡ Single-pass algorithms for statistics

  • GLM.jl

    Generalized linear models in Julia

  • StatsBase.jl

    Basic statistics for Julia

  • GeoStats.jl

    An extensible framework for geospatial data science and geostatistical modeling fully written in Julia

  • MixedModels.jl

    A Julia package for fitting (statistical) mixed-effects models

  • HypothesisTests.jl

    Hypothesis tests for Julia

  • ScientificTypes.jl

    An API for dispatching on the "scientific" type of data instead of the machine type

  • ARCHModels.jl

    A Julia package for estimating ARMA-GARCH models.

  • MarSwitching.jl

    MarSwitching.jl: Julia package for Markov switching dynamic models

    Project mention: MarSwitching.jl: New package for Markov Switching regression models | /r/Julia | 2023-10-04

    You can read more and check the examples in the package repo here: https://github.com/m-dadej/MarSwitching.jl

  • StatsAPI.jl

    A statistics-focused namespace for packages to share functions

  • NumericalAlgorithms.jl

    [DEPRECATED] Statistics & Numerical algorithms implemented in Julia.

  • biomisc_julia

    Collection of miscellaneous command-line bioinformatics scripts written in Julia for speed

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2023-10-04.

What are some of the best open-source Statistics projects in Julia? This list will help you:

Rank  Project                 Stars
1     MLJ.jl                  1,700
2     Distributions.jl        1,057
3     StatsWithJuliaBook      1,056
4     OnlineStats.jl          810
5     GLM.jl                  565
6     StatsBase.jl            559
7     GeoStats.jl             478
8     MixedModels.jl          393
9     HypothesisTests.jl      285
10    ScientificTypes.jl      92
11    ARCHModels.jl           87
12    MarSwitching.jl         29
13    StatsAPI.jl             17
14    NumericalAlgorithms.jl  12
15    biomisc_julia           1