ispc vs highway

ispc

Intel® Implicit SPMD Program Compiler (by ispc)


highway

Performance-portable, length-agnostic SIMD with runtime dispatch (by google)

Simd simd-instructions simd-programming intrinsics Avx2 Avx512 Neon WASM Avx avx-512 avx-instructions sse42 simd-library simd-parallelism simd-intrinsics



| Project | ispc | highway |
|---|---|---|
| Mentions | 4 | 66 |
| Stars | 2,402 | 3,623 |
| Growth | 1.0% | 3.3% |
| Activity | 9.5 | 9.8 |
| Latest Commit | 3 days ago | 5 days ago |
| Language | C++ | C++ |
| License | BSD 3-clause "New" or "Revised" License | Apache License 2.0 |

The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

ispc

Posts with mentions or reviews of ispc. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-10-14.

Implementing a GPU's Programming Model on a CPU
2 projects | news.ycombinator.com | 14 Oct 2023

This so-called GPU programming model existed for many decades before the first GPUs appeared, but at that time the compilers were not as good as the CUDA compilers, so the burden on the programmer was greater.
As another poster has already mentioned, there exists a compiler for CPUs which has been inspired by CUDA and which has been available for many years: ISPC (Implicit SPMD Program Compiler), at https://github.com/ispc/ispc .
NVIDIA has the very annoying habit of using a lot of terms that are different from those that have been previously used in computer science for decades. The worst is that NVIDIA has not invented new words, but they have frequently reused words that have been widely used with other meanings.
SIMT (Single-Instruction Multiple Thread) is not the worst term coined by NVIDIA, but there was no need for yet another acronym. For instance they could have used SPMD (Single Program, Multiple Data Stream), which dates from 1988, two decades before CUDA.
Moreover, SIMT is the same thing that was called "array of processes" by C.A.R. Hoare in August 1978 (in "Communicating Sequential Processes"), or "replicated parallel" by Occam in 1985 or "PARALLEL DO" by "OpenMP Fortran" in 1997-10 or "parallel for" by "OpenMP C and C++" in 1998-10.
The only (but extremely important) innovation brought by CUDA is that the compiler is smart enough that the programmer does not need to know the structure of the processor, i.e. how many cores it has and how many SIMD lanes each core has. The CUDA compiler automatically distributes the work over the available SIMD lanes and cores, and in most cases the programmer does not care whether two executions of the per-item function run on two different cores or on two different SIMD lanes of the same core.
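The idea in the last paragraph can be illustrated with a minimal sketch. It is a sequential simulation only, not how CUDA or ISPC is implemented: the `launch` helper and its `cores`/`lanes` parameters are hypothetical names introduced here. The point is that the kernel is written once per data item and never sees which core or lane runs it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical launcher: sequentially simulates a machine with `cores` cores
// of `lanes` SIMD lanes each. Every (core, lane) pair is assigned one item
// index per step; the kernel only ever receives that item index.
template <typename Kernel>
void launch(std::size_t n, std::size_t cores, std::size_t lanes, Kernel kernel) {
  assert(cores > 0 && lanes > 0);
  const std::size_t width = cores * lanes;  // items covered per step
  for (std::size_t base = 0; base < n; base += width) {
    for (std::size_t core = 0; core < cores; ++core) {
      for (std::size_t lane = 0; lane < lanes; ++lane) {
        const std::size_t i = base + core * lanes + lane;
        if (i < n) kernel(i);  // lanes past the end of the data stay idle
      }
    }
  }
}

// Per-item program: out[i] = a * x[i] + y[i]. Nothing in the kernel depends
// on the machine shape passed to launch().
std::vector<float> saxpy(float a, const std::vector<float>& x,
                         const std::vector<float>& y,
                         std::size_t cores, std::size_t lanes) {
  assert(x.size() == y.size());
  std::vector<float> out(x.size());
  launch(x.size(), cores, lanes,
         [&](std::size_t i) { out[i] = a * x[i] + y[i]; });
  return out;
}
```

Calling `saxpy` with `cores = 1, lanes = 1` or with `cores = 4, lanes = 8` gives the same result, which is the property the comment describes: the mapping of items to cores and lanes is the launcher's concern, not the programmer's.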
SIMD intrinsics and the possibility of a standard library solution
16 projects | /r/cpp | 8 Jan 2023

ISPC: https://github.com/ispc/ispc
Prefix Sum with SIMD
2 projects | news.ycombinator.com | 12 Feb 2022

Have you looked at [ISPC - Intel SPMD Program Compiler][0]?
[0]: https://github.com/ispc/ispc
Duff’s Device in 2021
3 projects | news.ycombinator.com | 18 Nov 2021

highway

Posts with mentions or reviews of highway. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-03-31.

Llamafile 0.7 Brings AVX-512 Support: 10x Faster Prompt Eval Times for AMD Zen 4
3 projects | news.ycombinator.com | 31 Mar 2024

The bf16 dot instruction replaces 6 instructions: https://github.com/google/highway/blob/master/hwy/ops/x86_12...
JPEG XL and the Pareto Front
9 projects | news.ycombinator.com | 1 Mar 2024

[0] for those interested in Highway.
It's also mentioned in [1], which starts off
> Today we're sharing open source code that can sort arrays of numbers about ten times as fast as the C++ std::sort, and outperforms state of the art architecture-specific algorithms, while being portable across all modern CPU architectures. Below we discuss how we achieved this.
[0] https://github.com/google/highway
[1] https://opensource.googleblog.com/2022/06/Vectorized%20and%2..., which has an associated paper at https://arxiv.org/pdf/2205.05982.pdf.
Gemma.cpp: lightweight, standalone C++ inference engine for Gemma models
7 projects | news.ycombinator.com | 23 Feb 2024

Thanks so much!
Everyone working on this self-selected into contributing, so I think of it less as my team than ... a team?
Specifically want to call out: Jan Wassenberg (author of https://github.com/google/highway) and I started gemma.cpp as a small project just a few months ago + Phil Culliton, Dan Zheng, and Paul Chang + of course the GDM Gemma team.
From slow to SIMD: A Go optimization story
10 projects | news.ycombinator.com | 23 Jan 2024

C++ users can enjoy Highway [1].
[1] https://github.com/google/highway/
GDlog: A GPU-Accelerated Deductive Engine
16 projects | news.ycombinator.com | 3 Dec 2023
Designing a SIMD Algorithm from Scratch
3 projects | news.ycombinator.com | 28 Nov 2023

At that point it is better to have some kind of DSL that should not be in the main language, because it would target a much lower level than a typical program. The best effort I've seen in this scene was Google's Highway [1] (not to be confused with HighwayHash) and I even once attempted to recreate it in Rust, but it is still distanced from my ideal.
[1] https://github.com/google/highway
SIMD Everywhere Optimization from ARM Neon to RISC-V Vector Extensions
6 projects | news.ycombinator.com | 29 Sep 2023

Interesting, thanks for sharing :)
At the time we open-sourced Highway, the standardization process had already started and there were some discussions.
I'm curious why stdlib is the only path you see to default? Compare the activity level of https://github.com/VcDevel/std-simd vs https://github.com/google/highway. As to open-source usage, after years of std::experimental, I see <200 search hits [1], vs >400 for Highway [2], even after excluding several library users.
But that aside, I'm not convinced standardization is the best path for a SIMD library. We and external users extend Highway on a weekly basis as new use cases arise. What if we deferred those changes to 3-monthly meetings, or had to wait for one meeting per WD, CD, (FCD), DIS, (FDIS) stage before it's standardized? Standardization seems more useful for rarely-changing things.
1: https://sourcegraph.com/search?q=context:global+std::experim...
2: https://sourcegraph.com/search?q=context:global+HWY_NAMESPAC...
Permuting Bits with GF2P8AFFINEQB
1 project | news.ycombinator.com | 27 Sep 2023

Thanks for the link. We were previously using GFNI for bit reversal and 8-bit shifts, and I just extended that to our 8-bit BroadcastSignBit (https://github.com/google/highway/pull/1784).
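For readers unfamiliar with the instruction, here is a scalar model of the per-byte affine transform that GF2P8AFFINEQB applies, as I understand Intel's documented pseudocode. It is a sketch for illustration, not Highway's code: the function name and the two matrix constants are assumptions chosen here to show bit reversal and sign-bit broadcast.

```cpp
#include <cstdint>

// Scalar model of one byte of the affine transform: bit i of the result is
// the parity of (matrix byte [7 - i] AND x), XORed with bit i of `imm`.
std::uint8_t AffineByte(std::uint64_t matrix, std::uint8_t x,
                        std::uint8_t imm = 0) {
  std::uint8_t out = 0;
  for (int i = 0; i < 8; ++i) {
    const std::uint8_t row =
        static_cast<std::uint8_t>(matrix >> (8 * (7 - i)));
    const std::uint8_t masked = static_cast<std::uint8_t>(row & x);
    int parity = 0;
    for (int b = 0; b < 8; ++b) parity ^= (masked >> b) & 1;
    const int bit = parity ^ ((imm >> i) & 1);
    out = static_cast<std::uint8_t>(out | (bit << i));
  }
  return out;
}

// Illustrative matrices (assumed names). Byte k of kReverseBits is 1 << k,
// so result bit i copies input bit 7 - i.
constexpr std::uint64_t kReverseBits = 0x8040201008040201ULL;
// Every byte selects input bit 7, so all result bits copy the sign bit.
constexpr std::uint64_t kBroadcastSign = 0x8080808080808080ULL;
```

With these constants, `AffineByte(kReverseBits, 0x16)` yields `0x68`, and `AffineByte(kBroadcastSign, x)` yields `0xFF` for any `x` with the top bit set and `0x00` otherwise. The real instruction applies the same transform to every byte of a vector at once.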
Six times faster than C
4 projects | news.ycombinator.com | 6 Jul 2023

You could study Google's Highway library [1].
[1] https://github.com/google/highway
AMD EPYC 97x4 “Bergamo” CPUs: 128 Zen 4c CPU Cores for Servers, Shipping Now
1 project | news.ycombinator.com | 24 Jun 2023

Runtime feature detection need not be rare nor hard, it's a few dozen lines of boilerplate. You can even write your code just once: see https://github.com/google/highway#examples.
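A rough sketch of what such boilerplate can look like, assuming GCC or Clang on x86 (the `__builtin_cpu_supports` builtin and the `target` function attribute are compiler extensions). This is not Highway's dispatch mechanism; the function names are hypothetical, and other compilers or architectures fall back to the scalar path.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Baseline version, compiled for the default target.
std::uint64_t SumScalar(const std::uint32_t* v, std::size_t n) {
  std::uint64_t s = 0;
  for (std::size_t i = 0; i < n; ++i) s += v[i];
  return s;
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define SKETCH_X86_DISPATCH 1
// Same source, but the compiler may use AVX2 instructions in this function.
__attribute__((target("avx2")))
std::uint64_t SumAvx2(const std::uint32_t* v, std::size_t n) {
  std::uint64_t s = 0;
  for (std::size_t i = 0; i < n; ++i) s += v[i];
  return s;
}
#endif

using SumFn = std::uint64_t (*)(const std::uint32_t*, std::size_t);

// Queries the CPU once and returns the best implementation it supports.
SumFn ChooseSum() {
#ifdef SKETCH_X86_DISPATCH
  __builtin_cpu_init();
  if (__builtin_cpu_supports("avx2")) return &SumAvx2;
#endif
  return &SumScalar;
}

// Public entry point: the choice is made on first call and then cached.
std::uint64_t Sum(const std::uint32_t* v, std::size_t n) {
  assert(v != nullptr || n == 0);
  static const SumFn fn = ChooseSum();
  return fn(v, n);
}
```

Callers only use `Sum`; the binary runs on CPUs without AVX2 because the AVX2 function is never called unless the check succeeds.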

What are some alternatives?

When comparing ispc and highway you can also consider the following projects:

Beef - Beef Programming Language

xsimd - C++ wrappers for SIMD intrinsics and parallelized, optimized mathematical functions (SSE, AVX, AVX512, NEON, SVE)

ParallelReductionsBenchmark - Thrust, CUB, TBB, AVX2, CUDA, OpenCL, OpenMP, SyCL - all it takes to sum a lot of numbers fast!

Vc - SIMD Vector Classes for C++

micro-profiler - Cross-platform low-footprint realtime C/C++ Profiler

swup - Versatile and extensible page transition library for server-rendered websites 🎉

elena-lang - ELENA is a general-purpose language with late binding. It is multi-paradigm, combining features of functional and object-oriented programming. A rich set of tools is provided to deal with message dispatching: multi-methods, message qualifying, generic message handlers, run-time interfaces

DirectXMath - DirectXMath is an all inline SIMD C++ linear algebra library for use in games and graphics apps

lunix - Lua Unix Module.

riscv-v-spec - Working draft of the proposed RISC-V V vector extension

eve - Expressive Vector Engine - SIMD in C++ Goes Brrrr

jpeg-xl
