Rust Simd

Open-source Rust projects categorized as Simd

Top 22 Rust Simd Projects

  • hora

    🚀 efficient approximate nearest neighbor search algorithm collections library written in Rust 🦀.

    Project mention: Building a Vector Database with Rust to Make Use of Vector Embeddings | /r/rust | 2023-06-01

    We have been playing around with Hora as a replacement for the Rust-CV implementation as we want PQ as well. I'll check out instant-distance, looks very interesting!

  • uwu

    fastest text uwuifier in the west

    Project mention: I’ve fallen in love with rust so now what? | /r/programmingcirclejerk | 2023-05-23

    that’s webshit shit. real rust programmers use uwu to uwufy text BLAZINGLY FAST 🚀🚀🚀🔥🔥🔥

  • glam-rs

    A simple and fast linear algebra library for games and graphics

    Project mention: Generic modules? (like for different float types: f32, f64, arbitary rational, fixed point float, etc) | /r/rust | 2022-11-11

  • cgmath-rs

    A linear algebra and mathematics library for computer graphics.

    Project mention: Hey Rustaceans! Got a question? Ask here! (31/2022)! | /r/rust | 2022-08-02

    Take a look into math libraries, like glam, nalgebra, and cgmath. I've only used these through game engines, though, so I can't offer per-basis reviews/advice.

  • simd-json

    Rust port of simdjson

    Project mention: fn please_compile_with_a_simd_compatible_cpu_setting_read_the_simdjsonrs_readme() -> ! {} | /r/rustjerk | 2023-01-07

  • stdarch

    Rust's standard library vendor-specific APIs and run-time feature detection

    Project mention: Detecting SIMD support on ARM with Android (and patching the Rust compiler for it) | /r/rust | 2022-11-10

    Good to know! How would you compare it with the std_detect implementation, underlying the standard library (https://github.com/rust-lang/stdarch/tree/master/crates/std_detect)?
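
    As a rough illustration of what run-time feature detection looks like from user code, here is a minimal sketch built on the standard library's `is_x86_feature_detected!` macro. The function names (`sum_u32`, `sum_avx2`, `sum_scalar`) are hypothetical, and the AVX2 path simply relies on the compiler auto-vectorizing the loop rather than on hand-written intrinsics.

```rust
/// Portable fallback: a plain loop the compiler may or may not vectorize.
fn sum_scalar(data: &[u32]) -> u32 {
    data.iter().fold(0u32, |acc, &x| acc.wrapping_add(x))
}

/// Same loop, but compiled with AVX2 enabled so the optimizer is allowed
/// to emit AVX2 instructions for it. Only built on x86 / x86_64.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx2")]
unsafe fn sum_avx2(data: &[u32]) -> u32 {
    data.iter().fold(0u32, |acc, &x| acc.wrapping_add(x))
}

/// Hypothetical entry point: picks an implementation at run time.
pub fn sum_u32(data: &[u32]) -> u32 {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::arch::is_x86_feature_detected!("avx2") {
            // SAFETY: the CPU was just checked for AVX2 support.
            return unsafe { sum_avx2(data) };
        }
    }
    // Any other architecture, or an x86 CPU without AVX2.
    sum_scalar(data)
}
```

    Both paths compute the same wrapping sum, so the choice only affects speed, never the result.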

  • rust-memchr

    Optimized string search routines for Rust.

    Project mention: Sneller Regex vs Ripgrep | news.ycombinator.com | 2023-05-18

    And that is the primary reason why ripgrep doesn't bother with AVX-512. Not because of some lack of skill as this blog suggests:

    > Additionally, ripgrep uses AVX2 and does not take advantage of AVX-512 instruction sets, but this can be forgiven given the specialized skills required for handcrafting for SkylakeX and Icelake/Zen4 processors.

    Namely, I tried running sneller on my CPU, which is a pretty recent i9-12900K, and not even it supports AVX-512. That's because Intel has been dropping support for AVX-512 from its more recent consumer grade CPUs. ripgrep is running far more frequently on consumer grade CPUs, so supporting AVX-512 is probably not particularly advantageous. At least, it's not obvious to me that it's worth doing. And certainly, the skill argument isn't entirely wrong. I'd have to invest developer time to make it work.

    I think there are two other things worth highlighting from this blog.

    First is that sneller seems to do quite well with compressed data. This is definitely not ripgrep's strong suit. When you use ripgrep's -z/--search-zip flag, all it's doing is shelling out to your gzip/xz/whatever executable to do the decompression work, which is then streamed into ripgrep for searching. So if your search speed tanks when using -z/--search-zip, it's likely because your decompression tools are slow, not because of ripgrep. But it's a fair comparison from sneller's perspective, because it seems to integrate the two.

    Second is the issue of multi-threaded search. In ripgrep, the fundamental unit of work is "search a file." ripgrep has no support for more granular parallelism. That is, if you give it one file, it's limited to doing a single threaded search. ripgrep could do more granular parallelism, but it hasn't been obviously worth it to me. If most searches are on a directory tree, then parallelizing at the level of each file is almost certainly good enough. Making ripgrep's parallelism more fine grained is a fair bit of work too, and there would be a lot of fiddly stuff to get right. If I could run sneller easily, I'd probably try to see how it does in a more varied workload than what is presented in this blog. :-)

    And finally some corrections:

    > However, when using a single thread, ripgrep appears to be slightly faster.

    Not just slightly faster, over 2x faster!

    The single threaded results for Regex2 and Regex3 for Sneller are quite nice! I'd be interested in hearing more about what you're doing in the Regex2 case, since Sneller and ripgrep are about on par with the Regex3 case. Maybe a fail fast optimization?

    > The reason for this is that ripgrep uses the Boyer-Moore string search algorithm, which is a pattern matching algorithm that is often used for searching for substrings within larger strings. It is particularly efficient when the pattern being searched for is relatively long and the alphabet of characters being searched over is relatively small. Sneller does not use this substring search algorithm and as a result is slower than ripgrep with substrings. However, when long substrings are not present, Sneller outperforms ripgrep.

    ripgrep has never used Boyer-Moore. (Okay, some years ago, ripgrep could use Boyer-Moore in certain niche cases. But that hasn't been the case for a while and it was never the thing most commonly used). What ripgrep uses today is succinctly described here: https://github.com/BurntSushi/memchr#algorithms-used (But it has always eschewed algorithms like Boyer-Moore in favor of more heuristic-y approaches based on a background frequency distribution of bytes.)

    I think I would also contest the claim that "long substrings" are the key here. ripgrep is plenty fast with short substrings too. You're correct that if you have no literals then ripgrep will get slower because it has to fall back to the regex engine. But I'd like to see more robust benchmarks there. Your Regex2 and Regex3 benchmarks raise more questions than they answer. :-)

    > Although the resulting .dot and .svg files may be somewhat clunky, we can still observe from the graph that the number of nodes and edges are small enough to use the branchless IceLake implementation. In this particular case, we only need 8 bits to encode the number of nodes and the number of distinct edges, enabling the tool to use (what we call) the 8-bit DFA implementation. For more details on how this works, see our post on regex implementations.

    So this is talking about the DFA graph for the regex `Sherlock [A-Z]\w+`. It's important to point out that, in ripgrep, `\w` is Unicode aware by default. Which makes it absolutely enormous. So I think the state graph you linked is probably only for the ASCII version of that regex.

    Indeed, reading your regex blog[1], it perhaps looks like a lot of the tricks you use won't work for Unicode, because Unicode tends to blow up finite automata.

    If I could run Sneller, I'd probably try to poke it to see what its Unicode support looks like. From a quick glance of the source code, it also looks like you build full DFAs. So I would also try to poke it to see what happens when handed a not-so-small regex. (Building a DFA can take quite some time.)

    Ah okay, I see, you put a max limit on the DFA: https://github.com/SnellerInc/sneller/blob/bb5adec564bf9869d...

    Overall this is a very cool project!

    [1]: https://sneller.io/blog/accelerating-regex-using-avx-512/
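
    The comment above mentions preferring heuristics based on a background frequency distribution of bytes over Boyer-Moore. The sketch below illustrates that general idea only and is not memchr's actual implementation: it picks the needle byte assumed to be rarest, scans for that byte, and verifies each candidate. The `rank` table is invented for illustration, and the function name `find_rare_byte` is hypothetical.

```rust
/// Hypothetical background ranking: a lower value means "assumed rarer".
/// These numbers are invented for this sketch, not taken from any library.
fn rank(byte: u8) -> u8 {
    match byte {
        b' ' | b'e' | b't' | b'a' | b'o' | b'i' | b'n' => 255,
        b'a'..=b'z' => 200,
        b'A'..=b'Z' | b'0'..=b'9' => 120,
        _ => 50,
    }
}

/// Substring search that scans for the needle's assumed-rarest byte and
/// only compares the full needle at positions where that byte lines up.
pub fn find_rare_byte(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    if needle.is_empty() {
        return Some(0);
    }
    if needle.len() > haystack.len() {
        return None;
    }
    // Offset and value of the assumed-rarest byte (first one on ties).
    let (rare_off, &rare) = needle
        .iter()
        .enumerate()
        .min_by_key(|&(_, &b)| rank(b))?;
    let last_start = haystack.len() - needle.len();
    let mut start = 0;
    while start <= last_start {
        // Only look where the rare byte could sit for a match starting
        // somewhere in start..=last_start.
        let window = &haystack[start + rare_off..=last_start + rare_off];
        let hit = window.iter().position(|&b| b == rare)?;
        let candidate = start + hit;
        if &haystack[candidate..candidate + needle.len()] == needle {
            return Some(candidate);
        }
        start = candidate + 1;
    }
    None
}
```

    In a real library the byte scan in the inner step is the part that benefits from SIMD; here it is an ordinary iterator search to keep the sketch self-contained.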

  • simdutf8

    SIMD-accelerated UTF-8 validation for Rust.

  • multiversion

    Easy function multiversioning for Rust

  • thermite

    Thermite SIMD: Melt your CPU

  • wide

    A crate to help you go wide. By which I mean use SIMD stuff. (by Lokathor)

  • neural-network-from-scratch

    A neural network library written from scratch in Rust along with a web-based application for building + training neural networks + visualizing their outputs

    Project mention: Examine individual neurons of a small neural network in the browser | news.ycombinator.com | 2023-05-10

  • tsdownsample

    High-performance time series downsampling algorithms for visualization

    Project mention: downsampling 500M datapoints in < 0.05s | /r/Python | 2023-01-27

  • sliceslice-rs

    A fast implementation of single-pattern substring search using SIMD acceleration.

  • varint-simd

    Decoding and encoding gigabytes of LEB128 variable-length integers per second in Rust with SIMD
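
    For readers unfamiliar with the format, here is a scalar reference sketch of unsigned LEB128 encoding and decoding. It is not varint-simd's API; the function names `encode_leb128` and `decode_leb128` are hypothetical, and a SIMD implementation would process many bytes per step instead of one.

```rust
/// Appends `value` to `out` as unsigned LEB128: 7 payload bits per byte,
/// with the high bit set on every byte except the last.
pub fn encode_leb128(mut value: u64, out: &mut Vec<u8>) {
    loop {
        let mut byte = (value & 0x7f) as u8;
        value >>= 7;
        if value != 0 {
            byte |= 0x80; // more bytes follow
        }
        out.push(byte);
        if value == 0 {
            break;
        }
    }
}

/// Decodes one unsigned LEB128 value from the front of `input`.
/// Returns the value and the number of bytes consumed, or `None` if the
/// input ends early or runs past the 10 bytes a u64 can need.
pub fn decode_leb128(input: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    for (i, &byte) in input.iter().enumerate() {
        let shift = 7 * i as u32;
        if shift >= 64 {
            return None;
        }
        value |= u64::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}
```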

  • simd-alphatensor-rs

    🧮 alphatensor matrix breakthrough algorithms + simd + rust.

    Project mention: Got bored and implemented the AlphaTensor matrix multiplication algorithms in Rust with SIMD https://github.com/drbh/simd-alphatensor-rs | /r/rust | 2022-10-09

    thanks for the feedback u/lebensterben. Please see https://github.com/drbh/simd-alphatensor-rs#benchmarks for the results and https://github.com/drbh/simd-alphatensor-rs/blob/main/benches/my_benchmark/main.rs for the actual tests. Overall the results are promising - however, benches should always be taken with a grain of salt. I hope this is helpful!

  • faster-hex

    fast hex

  • argminmax

    Efficient argmin & argmax

    Project mention: [P] tsdownsample: extremely fast time series downsampling for visualization | /r/MachineLearning | 2023-01-24

    Fast: leverages the optimized argminmax crate which is SIMD accelerated with runtime feature detection (matches or even outperforms numpy's speed)
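
    For context, this is the operation being accelerated, written as a plain scalar sketch. It is not the argminmax crate's API; the generic function below is hypothetical and returns the indices of the first minimum and first maximum in a single pass.

```rust
/// Returns `(index_of_min, index_of_max)` for a non-empty slice, or `None`
/// for an empty one. On ties, the first occurrence wins.
pub fn argminmax<T: PartialOrd + Copy>(data: &[T]) -> Option<(usize, usize)> {
    let first = *data.first()?;
    let (mut min_i, mut max_i) = (0, 0);
    let (mut min_v, mut max_v) = (first, first);
    for (i, &v) in data.iter().enumerate().skip(1) {
        if v < min_v {
            min_v = v;
            min_i = i;
        } else if v > max_v {
            max_v = v;
            max_i = i;
        }
    }
    Some((min_i, max_i))
}
```

    A SIMD version would compare several lanes per instruction and reduce the per-lane winners at the end; this sketch makes no attempt at that and ignores NaN handling.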

  • amx-rs

    Rust wrapper for Apple Matrix Coprocessor (AMX) instructions

    Project mention: Apple AMX instruction set (M1/M2 matrix coprocessor) | news.ycombinator.com | 2022-09-05

    Slightly cursed is how I roll. I would like to know where the trick first originated from though (I found it at https://github.com/yvt/amx-rs/blob/main/src/nativeops.rs#L22 rather than inventing it de novo)

  • bitsvec

    A bit vector with the Rust standard library's portable SIMD API.

  • simd-adler32

    A SIMD-accelerated Adler-32 hash algorithm implementation.
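
    As a point of reference, the checksum itself fits in a few lines. The following is a straightforward scalar sketch of Adler-32, not the simd-adler32 crate's code; the function name `adler32` is chosen here for illustration.

```rust
/// Largest prime below 2^16, the modulus used by Adler-32.
const MOD_ADLER: u32 = 65_521;

/// Scalar Adler-32: `a` is 1 plus the running sum of bytes, `b` is the
/// running sum of the `a` values, both reduced modulo 65521.
pub fn adler32(data: &[u8]) -> u32 {
    let (mut a, mut b) = (1u32, 0u32);
    for &byte in data {
        a = (a + u32::from(byte)) % MOD_ADLER;
        b = (b + a) % MOD_ADLER;
    }
    (b << 16) | a
}
```

    Reducing after every byte keeps the sketch obviously overflow-free; faster implementations defer the modulo and process many bytes between reductions.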

  • spaceform

    A cross-platform SIMD-accelerated math library for 3D rendering and simulation

    Project mention: Got bored and implemented the AlphaTensor matrix multiplication algorithms in Rust with SIMD https://github.com/drbh/simd-alphatensor-rs | /r/rust | 2022-10-09

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2023-06-01.

Index

What are some of the best open-source Simd projects in Rust? This list will help you:

#   Project                       Stars
1   hora                          2,375
2   uwu                           1,221
3   glam-rs                       1,116
4   cgmath-rs                     1,042
5   simd-json                       835
6   stdarch                         550
7   rust-memchr                     512
8   simdutf8                        469
9   multiversion                    158
10  thermite                        155
11  wide                            137
12  neural-network-from-scratch      98
13  tsdownsample                     86
14  sliceslice-rs                    70
15  varint-simd                      67
16  simd-alphatensor-rs              56
17  faster-hex                       55
18  argminmax                        38
19  amx-rs                           33
20  bitsvec                          31
21  simd-adler32                     25
22  spaceform                         5