Top 23 Rust Machine Learning Projects
-
qdrant
Qdrant - Vector Database for the next generation of AI applications. Also available in the cloud https://cloud.qdrant.io/
There are plenty of options, but I'd suggest Qdrant on Docker: https://qdrant.tech/
-
Project mention: Have you ever wanted a library to check for 69 in a string? | /r/rustjerk | 2023-04-20
You can use Tensorflow for Rust to simplify that task and avoid pain with regex. Just have the right mindset.
-
postgresml
PostgresML is an AI application database. Download open source models from Huggingface, or train your own, to create and index LLM embeddings, generate text, or make online predictions using only SQL.
Project mention: Python SDK for PostgresML with scalable LLM embedding memory and text generation | news.ycombinator.com | 2023-06-02
We've been working on a Python SDK[1] for PostgresML to make it easier for application developers to get the performance and scalability benefits of integrated memory for LLMs, by combining embedding generation, vector recall and LLM tasks from HuggingFace in a single database query.
This work builds on our previous efforts that give a 10x performance improvement from generating the LLM embedding[2] from input text along with tuning vector recall[3] in a single process to avoid excessive network transit.
We'd love your feedback on our roadmap[4] for this extension, if you have other use cases for an ML application database. So far, we've implemented our best practices for scalable vector storage to provide an example reference implementation for interacting with an ML application database based on Postgres.
[1]: https://github.com/postgresml/postgresml/tree/master/pgml-sd...
-
tch-rs
Project mention: llm: a Rust crate/CLI for CPU inference of LLMs, including LLaMA, GPT-NeoX, GPT-J and more | /r/rust | 2023-05-09
You could try looking at the min-GPT example of tch-rs. I'd also strongly suggest watching Karpathy's video to understand what's going on.
-
Project mention: Why is Rust not more popular in ML and secure edge computing? | /r/rust | 2022-11-13
-
hora
🚀 efficient approximate nearest neighbor search algorithm collections library written in Rust 🦀 .
Project mention: Building a Vector Database with Rust to Make Use of Vector Embeddings | /r/rust | 2023-06-01
We have been playing around with Hora as a replacement for the Rust-CV implementation, as we want PQ as well. I'll check out instant-distance; it looks very interesting!
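For context on what approximate nearest neighbor libraries such as Hora are meant to speed up, here is a minimal sketch of the exact brute-force baseline in plain Rust. It does not use Hora's API. The function names `squared_distance` and `brute_force_knn` are hypothetical and chosen only for illustration, and the sketch assumes all vectors have the same length.

```rust
/// Squared Euclidean distance between two vectors.
/// Assumes `a` and `b` have the same length (illustrative helper, not a library API).
fn squared_distance(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Exact k-nearest-neighbor search: score every stored vector against the
/// query, sort by distance, and return the indices of the k closest vectors,
/// nearest first. Cost grows linearly with the number of stored vectors,
/// which is the cost approximate indexes try to avoid.
fn brute_force_knn(data: &[Vec<f32>], query: &[f32], k: usize) -> Vec<usize> {
    let mut scored: Vec<(usize, f32)> = data
        .iter()
        .enumerate()
        .map(|(i, v)| (i, squared_distance(v, query)))
        .collect();
    // total_cmp gives a total order on f32, so sorting cannot panic on NaN.
    scored.sort_by(|a, b| a.1.total_cmp(&b.1));
    scored.into_iter().take(k).map(|(i, _)| i).collect()
}
```

Calling `brute_force_knn(&data, &query, k)` scans every vector on each query. An approximate index trades some recall for avoiding that full scan.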
-
burn
Here is the project: https://github.com/burn-rs/burn
-
rust-bert
Rust native ready-to-use NLP pipelines and transformer-based models (BERT, DistilBERT, GPT2,...)
I'd like to use this transformer model in Rust (because it's on the backend, because I can use data munging and it will be faster, and for other reasons). It looks like a good model! But it doesn't compile on Apple Silicon because of weird linking issues that aren't apparent - https://github.com/guillaume-be/rust-bert/issues/338. I've spent a large part of today and yesterday attempting to find out why. The only other library that I've found for doing this kind of thing programmatically (particularly sentiment analysis) is this (https://github.com/JohnSnowLabs/spark-nlp). Some of the models look a little older, which is OK, but it does mean that I'd have to do this in another language.
Does anyone know of any sentiment analysis software that can be tuned (other than VADER - I'm looking for more along the lines of a transformer model) - like BERT, but is pretrained and can be used in Rust or Python? Otherwise I'll probably end up using spark-nlp and having to spin up another process.
Thanks.
-
lance
Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, Pyarrow, with more integrations coming.
-
motorhead
Any alternatives? I found this Rust-based project that might be interesting: https://github.com/getmetal/motorhead
-
smartcore
A comprehensive library for machine learning and numerical computing. The library provides a set of tools for linear algebra, numerical computing, optimization, and enables a generic, powerful yet still efficient approach to machine learning.
Today we have released version 0.3 of smartcore: a comprehensive library for machine learning and numerical computing. The library provides a set of tools for linear algebra, numerical computing, optimization, and enables a generic, powerful yet still efficient approach to machine learning.
-
nlprule
A fast, low-resource Natural Language Processing and Text Correction library written in Rust.
Project mention: Language Tool – open-source Grammarly Alternative | news.ycombinator.com | 2022-07-26
Check out nlprule; it's a LanguageTool alternative written in Rust.
-
bytewax
Project mention: Stream processing framework for a new project in Python | /r/dataengineering | 2023-06-02
Disclaimer: I work on Bytewax, but it feels like this could be a good fit and would save you some time looking around. If you need to do stateful operations (reduce, window, etc.) then you can use bytewax - https://github.com/bytewax/bytewax with pub/sub, but you would need to build a custom connector. There are some guides on how to do that - https://www.bytewax.io/blog/custom-input-connector.
-
blindai
Project mention: [D] Any options for using GPT models using proprietary data ? | /r/MachineLearning | 2023-04-02
We are working on an open-source project, BlindAI (https://github.com/mithril-security/blindai), to answer exactly that: privacy when sending data to remote AI models.
-
dfdx
Project mention: Announcing dfdx - an deep learning library built with const generics | /r/rust | 2022-07-13
There are other differences in how nn layers are implemented, as you can see by comparing the source of the linear layers: https://github.com/coreylowman/dfdx/blob/main/src/nn/linear.rs vs https://github.com/c0dearm/mushin/blob/main/src/nn/layers/linear.rs
-
PERSIA
High performance distributed framework for training deep learning recommendation models based on PyTorch.
-
Project mention: This year I tried solving AoC using Rust, here are my impressions coming from Python! | /r/rust | 2023-01-02
Also http://arewelearningyet.com
-
NucliaDB https://github.com/nuclia/nucliadb
Rust Machine Learning related posts
- Python SDK for PostgresML with scalable LLM embedding memory and text generation
- [P] Python SDK for PostgresML w/ scalable LLM embedding memory and text generation
- [Show HN] Lance is a Rust-based alternative to Parquet for ML data
- Show HN: Lance is a Rust-based alternative to Parquet for ML data
- dna_parser : A Python package written in Rust to encode DNA sequences for machine learning.
- Any Alternatives to Langchain?
- Show HN: We unified LLMs, vector memory, ranking, pruning models in one process
-
A note from our sponsor - SonarLint
www.sonarlint.org | 6 Jun 2023
Index
What are some of the best open-source Machine Learning projects in Rust? This list will help you:
# | Project | Stars |
---|---|---|
1 | qdrant | 10,832 |
2 | leaf | 5,505 |
3 | rust | 4,474 |
4 | postgresml | 2,988 |
5 | tch-rs | 2,879 |
6 | linfa | 2,670 |
7 | hora | 2,375 |
8 | burn | 2,265 |
9 | rust-bert | 1,846 |
10 | lance | 1,685 |
11 | dfdx | 1,079 |
12 | juice | 1,038 |
13 | rustlearn | 576 |
14 | motorhead | 550 |
15 | smartcore | 532 |
16 | nlprule | 523 |
17 | bytewax | 473 |
18 | blindai | 430 |
19 | gamma | 371 |
20 | PERSIA | 358 |
21 | are-we-learning-yet | 326 |
22 | NucliaDB | 299 |
23 | autograph | 237 |