Top 23 Scientific Computing Open-Source Projects

Each entry lists, in order: mentions, GitHub stars, activity score, and primary language.

SciPy

50 12,407 9.9 Python

SciPy library main repository

Project mention: What Is a Schur Decomposition? | news.ycombinator.com | 2024-03-04

I guess it is a rite of passage to rewrite it. I'm doing it for SciPy too together with Propack in [1]. Somebody already mentioned your repo. Thank you for your efforts.
[1]: https://github.com/scipy/scipy/issues/18566
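For readers unfamiliar with the decomposition discussed in the post above, here is a minimal sketch of computing a real Schur form with SciPy's existing `scipy.linalg.schur`. The matrix `A` is an arbitrary example chosen for illustration and does not come from the linked discussion or issue.

```python
import numpy as np
from scipy.linalg import schur

# Arbitrary example matrix (an assumption for illustration only).
A = np.array([[0.0, 2.0, 2.0],
              [0.0, 1.0, 2.0],
              [1.0, 0.0, 1.0]])

# Real Schur form: A = Z @ T @ Z.T, where Z is orthogonal and T is
# quasi-upper-triangular (1x1 or 2x2 blocks on the diagonal).
T, Z = schur(A, output="real")

# Check the factorization numerically instead of relying on printed values.
reconstruction_error = np.linalg.norm(Z @ T @ Z.T - A)
orthogonality_error = np.linalg.norm(Z.T @ Z - np.eye(3))
print(reconstruction_error, orthogonality_error)
```

Both printed norms should be close to machine precision. Passing `output="complex"` instead yields a strictly upper-triangular `T` with the eigenvalues on its diagonal.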

Torch

2 8,891 0.0 C

http://torch.ch
gop

23 8,772 9.8 Go

The Go+ programming language is designed for engineering, STEM education, and data science

Project mention: Go Enums Suck | news.ycombinator.com | 2024-03-01

https://github.com/goplus/gop, but they go slightly too overboard imo.

gonum

24 7,249 8.2 Go

Gonum is a set of numeric libraries for the Go programming language. It contains libraries for matrices, statistics, optimization, and more

Project mention: How to set up interface to accept multi-dimension array? | /r/golang | 2023-07-13

But if you want to see what can be done for numeric stuff, check out gonum. Personally, I still wouldn't use Go, and I rather suspect it's still pretty easy to reach for something like what you're trying to do and not find it because Go just can't write that type sensibly, but you can at least see what is available, written by people who disagree with me about Go not being a great language for this.

burn

7 6,948 9.8 Rust

Burn is a new comprehensive dynamic Deep Learning Framework built using Rust with extreme flexibility, compute efficiency and portability as its primary goals.

Project mention: Transitioning From PyTorch to Burn | dev.to | 2024-02-14

[package]
name = "resnet_burn"
version = "0.1.0"
edition = "2021"

[dependencies]
burn = { git = "https://github.com/tracel-ai/burn.git", rev = "75cb5b6d5633c1c6092cf5046419da75e7f74b11", features = ["ndarray"] }
burn-import = { git = "https://github.com/tracel-ai/burn.git", rev = "75cb5b6d5633c1c6092cf5046419da75e7f74b11" }
image = { version = "0.24.7", features = ["png", "jpeg"] }

mlpack

4 4,787 9.9 C++

mlpack: a fast, header-only C++ machine learning library

Project mention: How much C++ is used when it comes to performing quant research? | /r/quant | 2023-07-03

Does C++ have the equivalent of Pandas or Apache Spark? Are there extensive libraries that exist/are being developed that allow you to perform operations with data? Or do people just use a combination of Python & its various libraries (NumPy etc)? If we leave aside the data bit, are there libraries that allow you to develop ML models in C++ (mlpack for instance) faster & more efficiently compared to their Python counterparts (scikit-learn)? On a more general note, how does C++ fit into the routine of a Quant Researcher? And at what scale does an organization decide they need to start switching to other languages and spend more time developing the code?

ArrayFire

6 4,395 7.8 C++

ArrayFire: a general purpose GPU library.

Project mention: Learn WebGPU | news.ycombinator.com | 2023-04-27

Loads of people have stated why easy GPU interfaces are difficult to create, but we solve many difficult things all the time.
Ultimately I think CPUs are just satisfactory for the vast vast majority of workloads. Servers rarely come with any GPUs to speak of. The ecosystem around GPUs is unattractive. CPUs have SIMD instructions that can help. There are so many reasons not to use GPUs. By the time anyone seriously considers using GPUs they're, in my imagination, typically seriously starved for performance, and looking to control as much of the execution details as possible. GPU programmers don't want an automagic solution.
So I think the demand for easy GPU interfaces is just very weak, and therefore no effort has taken off. The amount of work needed to make it as easy to use as CPUs is massive, and the only reason anyone would even attempt to take this on is to lock you in to expensive hardware (see CUDA).
For a practical suggestion, have you taken a look at https://arrayfire.com/ ? It can run on both CUDA and OpenCL, and it has C++, Rust and Python bindings.

stdlib

9 3,988 10.0 JavaScript

✨ Standard library for JavaScript and Node.js. ✨

Project mention: Node still seems better than python after all this time for web server speed but.. | /r/node | 2023-06-20

Numpy is a library - node.js has plenty of them, what is missing? There is stdlib package that offers optimized math functions, for example.

spack

52 3,938 10.0 Python

A flexible package manager that supports multiple versions, configurations, platforms, and compilers.

Project mention: Autodafe: "freeing your project from the clammy grip of autotools." | news.ycombinator.com | 2024-04-06

> Are we talking about the same autotools?
Yes. Instead of figuring out how to do something particular with every single software package, I can do a --with-foo or --without-bar or --prefix=/opt/baz-1.2.3, and be fairly confident that it will work the way I want.
Certainly with package managers or (FreeBSD) Ports a lot is taken care of behind the scenes, but the above would also help the package/port maintainers. Lately I've been using Spack for special-needs compiles, and maintainer ease helps there too, but there are still cases where a 'fully manual' compile is done.
> Suffice it to say, I prefer to work with handwritten makefiles.
Having everyone 'roll their own' system would probably be worse, because any "mysterious failure" then has to be debugged specially for each project.
Have you tried Spack?
* https://spack.io
* https://spack.readthedocs.io/en/latest/

matplotplusplus

26 3,909 6.5 C++

Matplot++: A C++ Graphics Library for Data Visualization 📊🗾

Project mention: Creating k-NN with C++ (from Scratch) | dev.to | 2024-01-11

cmake_minimum_required(VERSION 3.5)
project(knn_cpp CXX)

# Set up C++ version and properties
include(CheckIncludeFileCXX)
check_include_file_cxx(any HAS_ANY)
check_include_file_cxx(string_view HAS_STRING_VIEW)
check_include_file_cxx(coroutine HAS_COROUTINE)
set(CMAKE_CXX_STANDARD 20)
set(CMAKE_BUILD_TYPE Debug)
set(CMAKE_CXX_STANDARD_REQUIRED ON)
set(CMAKE_CXX_EXTENSIONS OFF)

# Copy data file to build directory
file(COPY ${CMAKE_CURRENT_SOURCE_DIR}/iris.data DESTINATION ${CMAKE_CURRENT_BINARY_DIR})

# Download library using FetchContent
include(FetchContent)
FetchContent_Declare(matplotplusplus
    GIT_REPOSITORY https://github.com/alandefreitas/matplotplusplus
    GIT_TAG origin/master)
FetchContent_GetProperties(matplotplusplus)
if(NOT matplotplusplus_POPULATED)
    FetchContent_Populate(matplotplusplus)
    add_subdirectory(${matplotplusplus_SOURCE_DIR} ${matplotplusplus_BINARY_DIR} EXCLUDE_FROM_ALL)
endif()

FetchContent_Declare(
    fmt
    GIT_REPOSITORY https://github.com/fmtlib/fmt.git
    GIT_TAG 7.1.3 # Adjust the version as needed
)
FetchContent_MakeAvailable(fmt)

# Add executable and link project libraries and folders
add_executable(${PROJECT_NAME} main.cc)
target_link_libraries(${PROJECT_NAME} PUBLIC matplot fmt::fmt)
aux_source_directory(lib LIB_SRC)
target_include_directories(${PROJECT_NAME} PRIVATE ${CMAKE_CURRENT_SOURCE_DIR})
target_sources(${PROJECT_NAME} PRIVATE ${LIB_SRC})
add_subdirectory(tests)

linfa

14 3,381 6.3 Rust

A Rust machine learning framework.
NumCpp

4 3,370 6.4 C++

C++ implementation of the Python Numpy library
rust-ndarray

20 3,307 8.1 Rust

ndarray: an N-dimensional array with array views, multidimensional slicing, and efficient operations

Project mention: Some Reasons to Avoid Cython | news.ycombinator.com | 2023-09-22

I would love some examples of how to do non-trivial data interop between Rust and Python. My experience is that PyO3/Maturin is excellent when converting between simple datatypes but conversions get difficult when there are non-standard types, e.g. Python Numpy arrays or Rust ndarrays or whatever other custom thing.
Polars seems to have a good model where it uses the Arrow in memory format, which has implementations in Python and Rust, and makes a lot of the ndarray stuff easier. However, if the Rust libraries are not written with Arrow first, they become quite hard to work with. For example, there are many libraries written with https://github.com/rust-ndarray/ndarray, which is challenging to interop with Numpy.
(I am not an expert at all, please correct me if my characterizations are wrong!)

FluidX3D

53 3,162 8.6 C++

The fastest and most memory efficient lattice Boltzmann CFD software, running on all GPUs via OpenCL.

Project mention: FluidX3D | news.ycombinator.com | 2024-03-24

FFTW

3 2,578 4.2 C

DO NOT CHECK OUT THESE FILES FROM GITHUB UNLESS YOU KNOW WHAT YOU ARE DOING. (See below.)
boinc

212 1,915 9.6 PHP

Open-source software for volunteer computing and grid computing.

Project mention: Distributed Inference and Fine-Tuning of Large Language Models over the Internet | news.ycombinator.com | 2024-01-02

Made me think of Gridcoin and BOINC https://boinc.berkeley.edu/

thread-pool

6 1,911 4.2 C++

BS::thread_pool: a fast, lightweight, and easy-to-use C++17 thread pool library
gosl

0 1,804 6.0 Go

Linear algebra, eigenvalues, FFT, Bessel, elliptic, orthogonal polys, geometry, NURBS, numerical quadrature, 3D transfinite interpolation, random numbers, Mersenne twister, probability distributions, optimisation, differential equations.
TileDB

12 1,762 9.7 C++

The Universal Storage Engine

Project mention: Ask HN: Who is hiring? (September 2023) | news.ycombinator.com | 2023-09-01

- single cell genomics: in collaboration with the Chan-Zuckerberg Initiative, we recently released TileDB-SOMA for single cell data, with APIs for both Python and R built around a common storage specification: https://tiledb.com/blog/tiledb-101-single-cell
With TileDB, all data — tables, genomics, images, videos, location, time-series — across multiple domains is captured as multi-dimensional arrays. TileDB Cloud implements a totally serverless infrastructure and delivers access control, easier data and code sharing and distributed computing at global scale, eliminating cluster management, minimizing TCO and promoting scientific collaboration and reproducibility.
Website: https://tiledb.com
GitHub: https://github.com/TileDB-Inc/TileDB

PyCUDA

0 1,740 5.4 Python

CUDA integration for Python, plus shiny features
casadi

4 1,549 9.3 C++

CasADi is a symbolic framework for numeric optimization implementing automatic differentiation in forward and reverse modes on sparse matrix-valued computational graphs. It supports self-contained C-code generation and interfaces state-of-the-art codes such as SUNDIALS, IPOPT etc. It can be used from C++, Python or Matlab/Octave.

Project mention: pyomo VS casadi - a user suggested alternative | libhunt.com/r/pyomo | 2023-09-05

Interface for several solvers and integrators.

mfem

7 1,530 9.9 C++

Lightweight, general, scalable C++ library for finite element methods
ruptures

1 1,467 5.7 Python

ruptures: change point detection in Python

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2024-04-06.

Scientific Computing related posts

Go Enums Suck
1 project | news.ycombinator.com | 1 Mar 2024
Burn Deep Learning Framework Release 0.12.0 Improved API and PyTorch Integration
1 project | news.ycombinator.com | 31 Jan 2024
Fix: Hong Kong is not in China
1 project | news.ycombinator.com | 23 Jan 2024
Supercharge Web AI Model Testing: WebGPU, WebGL, and Headless Chrome
2 projects | news.ycombinator.com | 16 Jan 2024
Burn Deep Learning Framework 0.11.0 Released: Just-in-Time Automatic Kernel Fusion & Founding Announcement
1 project | /r/rust | 3 Dec 2023
Burn Deep Learning Framework v0.11.0 Released: Just-in-Time Kernel Fusion
1 project | news.ycombinator.com | 1 Dec 2023
Burn – comprehensive dynamic Deep Learning Framework built using Rust
1 project | news.ycombinator.com | 23 Nov 2023

Index

What are some of the best open-source Scientific Computing projects? This list will help you:

	Project	Stars
1	SciPy	12,407
2	Torch	8,891
3	gop	8,772
4	gonum	7,249
5	burn	6,948
6	mlpack	4,787
7	ArrayFire	4,395
8	stdlib	3,988
9	spack	3,938
10	matplotplusplus	3,909
11	linfa	3,381
12	NumCpp	3,370
13	rust-ndarray	3,307
14	FluidX3D	3,162
15	FFTW	2,578
16	boinc	1,915
17	thread-pool	1,911
18	gosl	1,804
19	TileDB	1,762
20	PyCUDA	1,740
21	casadi	1,549
22	mfem	1,530
23	ruptures	1,467