R

Top 23 R Open-Source Projects

  1. ML-For-Beginners

    12 weeks, 26 lessons, 52 quizzes, classic Machine Learning for all

    Project mention: Learn Machine Learning with these GitHub repositories | news.ycombinator.com | 2025-01-15

    *Learn Machine Learning with these amazing GitHub repositories!*

    1⃣ [ML for Beginners](https://github.com/microsoft/ML-For-Beginners) by Microsoft

  2. CodeRabbit

    CodeRabbit: AI Code Reviews for Developers. Revolutionize your code reviews with AI. CodeRabbit offers PR summaries, code walkthroughs, 1-click suggestions, and AST-based analysis. Boost productivity and code quality across all major languages with each PR.

  3. Apache Spark

    Apache Spark - A unified analytics engine for large-scale data processing

    Project mention: Unveiling the Apache License 2.0: A Deep Dive into Open Source Freedom | dev.to | 2025-03-11

    One of the key attributes of Apache License 2.0 is its flexible nature. Permitting use in both proprietary and open source environments, it has become the go-to choice for innovative projects ranging from the Apache HTTP Server to large-scale initiatives like Apache Spark and Hadoop. This flexibility is not solely legal; it is also philosophical. The license is designed to encourage transparency and maintain a healthy balance between freedom and accountability, ultimately making it easier for developers to adapt and contribute without restrictive legal barriers. Another modern twist discussed in the article is the concept of dual licensing. Dual licensing can offer an attractive method for additional commercial exploitation while still upholding open source principles. However, as the article cautions, dual licensing involves legal intricacy and demands rigor in managing Contributor License Agreements (CLAs), a challenge that the open source community navigates with ongoing debates. For developers looking to understand similar innovative approaches to licensing, further information can be explored at License Token.

  4. Prophet

    Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.

    Project mention: TimesFM (Time Series Foundation Model) for time-series forecasting | news.ycombinator.com | 2024-05-08
  5. LightGBM

    A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.

  6. ds-cheatsheets

    List of Data Science Cheatsheets to rule the world

  7. mal

    mal - Make a Lisp

    Project mention: Remaking a rule-engine DSL | dev.to | 2024-11-17

    So this time I needed to tokenize and write the lexer on my own. If I only dealt with numbers, everything would be easy, but when it comes to strings, things get more complicated. I followed another tutorial and rediscovered the make-a-lisp project. Eventually I gave up and used the lexer provided by hy-lang.

  8. metaflow

    Build, Deploy and Manage AI/ML Systems

    Project mention: Show HN: Flow – A Dynamic Task Engine for AI Agents Without DAG | news.ycombinator.com | 2024-12-02

    Interesting! I feel like this is a cross between https://github.com/dagworks-inc/burr (switch state for context) and https://github.com/Netflix/metaflow because the output of the "task" declares its next hop...

  9. SaaSHub

    SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives

  10. hugo-blox-builder

    🚨 GROW YOUR AUDIENCE WITH HUGOBLOX! 🚀 HugoBlox is an easy, fast no-code website builder for researchers, entrepreneurs, data scientists, and developers. Build stunning sites in minutes. Quickly create beautiful websites with drag-and-drop, customizable templates, and built-in SEO tools!

  11. catboost

    A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.

    Project mention: 🚀 Why Your ML Service Needs Rust + CatBoost: A Setup Guide That Actually Works | dev.to | 2025-01-19

    [package]
    name = "MLApp"
    version = "0.1.0"
    edition = "2021"

    [dependencies]
    catboost = { git = "https://github.com/catboost/catboost", rev = "0bfdc35" }

  12. H2O

    H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.

  13. ggplot2

    An implementation of the Grammar of Graphics in R

    Project mention: Debugging Compiled Code for R with Positron | news.ycombinator.com | 2024-10-29

    Pardon me for shooting from the hip here, but IMO if you're using R for something radically different than statistical analysis and data visualization, there might be another tool/language that's more purpose-suited.

    > As someone who basically uses R as a nice LISP-y scripting language to orchestrate calling low-level compiled code from other languages

    When I read this, I think, would `bash` or something equally portable/universally installed work?

    R is a beautiful thing when limited to its core uses (I use it every day [0]). But in my experience, the more we build away from those core uses, the more brittleness we introduce. I wish the Posit team would focus on the core R experience, resolve some of the hundreds of open issues on its core packages in a timely way [1,2], and just generally play to R's strengths.

    [0] https://github.com/hsflabstanford/vegan-meta

    [1] https://github.com/rstudio/rmarkdown/issues

    [2] https://github.com/tidyverse/ggplot2/issues

  14. FriendsDontLetFriends

    Friends don't let friends make certain types of data visualization - What are they and why are they bad.

  15. awesome-R

    A curated list of awesome R packages, frameworks and software.

  16. papermill

    📚 Parameterize, execute, and analyze notebooks

    Project mention: Jupyter Notebooks as E2E Tests | news.ycombinator.com | 2024-12-18
  17. dplyr

    dplyr: A grammar of data manipulation

    Project mention: 1MinDocker #6 - Building further | dev.to | 2024-11-11

    dplyr

  18. r4ds

    R for data science: a book

    Project mention: Visualizing Data on a Mesh with Displacement Mapping in R | news.ycombinator.com | 2024-06-18

    My personal favorite resource is "R for Data Science" by Hadley Wickham. It covers lots of nice data manipulation and visualization examples, and provides a good introduction to the tidyverse, which is a particular dialect of R that's well-suited for data analysis. It's available for free at:

    https://r4ds.hadley.nz/

    For more specialized analytical methods there are lots of textbooks out there that provide a deep dive into packages for a specific field (e.g. survival analysis, machine learning, time series), but for general data manipulation and visualization it's hard to beat R4DS.

  19. wave

    Realtime Web Apps and Dashboards for Python and R (by h2oai)

    Project mention: This Week In Python | dev.to | 2024-08-30

    wave – Realtime Web Apps and Dashboards for Python and R

  20. ML-Workspace

    🛠 All-in-one web-based IDE specialized for machine learning and data science.

  21. Data-science-best-resources

    Carefully curated resource links for data science in one place

  22. rmarkdown

    Dynamic Documents for R

    Project mention: Reinventing notebooks as reusable Python programs | news.ycombinator.com | 2025-03-19

    I am surprised they didn't mention RMarkdown (https://rmarkdown.rstudio.com/), which was developed in parallel to Jupyter Notebooks, with lots of convergent evolution.

    RMarkdown is essentially Markdown with executable code blocks. While it comes from an R background, code blocks can be written in any language (and you can mix multiple languages).

    The biggest difference (and, I would say, advantage) is that it separates code from output, making it work well with version control.

  23. DifferentialEquations.jl

    Multi-language suite for high-performance solvers of differential equations and scientific machine learning (SciML) components. Ordinary differential equations (ODEs), stochastic differential equations (SDEs), delay differential equations (DDEs), differential-algebraic equations (DAEs), and more in Julia.

    Project mention: Modelica | news.ycombinator.com | 2024-12-16

    Another up-and-coming solution is Julia's simulation ecosystem [1]. It is powered by the commercial organization behind the Julia programming language, which has received DARPA funding [2] to build out these tools. This ecosystem unifies researchers in numerical methods [3], scalable compute, and domain experts in modeling engineering systems (electrical, mechanical, etc.). I believe this is where simulation is headed.

    [1] https://juliahub.com/products/juliasim

    [2] https://news.ycombinator.com/item?id=26425659

    [3] https://docs.sciml.ai/DiffEqDocs/stable/

  24. m2cgen

    Transform ML models into a native code (Java, C, Python, Go, JavaScript, Visual Basic, C#, R, PowerShell, PHP, Dart, Haskell, Ruby, F#, Rust) with zero dependencies

  25. AIF360

    A comprehensive set of fairness metrics for datasets and machine learning models, explanations for these metrics, and algorithms to mitigate bias in datasets and models.

    Project mention: The Cornerstones of Ethical Software Development: Privacy, Transparency, Fairness, Security, and Accountability | dev.to | 2024-06-26

    IBM AI Fairness 360 Toolkit

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

R related posts

  • The Application of Java Programming In Data Analysis and Artificial Intelligence

    1 project | dev.to | 10 Mar 2025
  • Apache Spark: Revolutionizing Big Data with Sustainable Open Source Funding

    1 project | dev.to | 6 Mar 2025
  • echarts4r VS echarty - a user suggested alternative

    2 projects | 3 Feb 2025
  • R package for collaborative writing of R Markdown documents in Google Docs

    1 project | news.ycombinator.com | 30 Jan 2025
  • Run PySpark Local Python Windows Notebook

    2 projects | dev.to | 21 Jan 2025
  • Infrastructure for data analysis with Jupyter, Cassandra, PySpark, and Docker

    2 projects | dev.to | 15 Jan 2025
  • Introducing Rlinguo, a native mobile app that runs R

    3 projects | dev.to | 23 Dec 2024

Index

What are some of the best open-source R projects? This list will help you:

| # | Project | Stars |
|---|---------|-------|
| 1 | ML-For-Beginners | 71,493 |
| 2 | Apache Spark | 40,785 |
| 3 | Prophet | 18,997 |
| 4 | LightGBM | 17,057 |
| 5 | ds-cheatsheets | 14,981 |
| 6 | mal | 10,220 |
| 7 | metaflow | 8,645 |
| 8 | hugo-blox-builder | 8,535 |
| 9 | catboost | 8,300 |
| 10 | H2O | 7,072 |
| 11 | ggplot2 | 6,639 |
| 12 | FriendsDontLetFriends | 6,632 |
| 13 | awesome-R | 6,119 |
| 14 | papermill | 6,116 |
| 15 | dplyr | 4,846 |
| 16 | r4ds | 4,723 |
| 17 | wave | 4,069 |
| 18 | ML-Workspace | 3,478 |
| 19 | Data-science-best-resources | 3,024 |
| 20 | rmarkdown | 2,918 |
| 21 | DifferentialEquations.jl | 2,921 |
| 22 | m2cgen | 2,865 |
| 23 | AIF360 | 2,546 |

