core-to-core-latency vs ipc-bench

core-to-core-latency

Measures the latency between CPU cores (by nviennot)

ipc-bench

🐎 Benchmarks for Inter-Process-Communication Techniques (by goldsborough)

Project          core-to-core-latency    ipc-bench
Mentions         11                      5
Stars            934                     642
Growth           -                       -
Activity         1.8                     0.0
Latest Commit    over 1 year ago         about 2 years ago
Language         Jupyter Notebook        C
License          MIT License             MIT License

The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

core-to-core-latency

Posts with mentions or reviews of core-to-core-latency. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-08-20.

Show HN: Visualize core-to-core latency on Linux in ~200 lines of C and Python
2 projects | news.ycombinator.com | 20 Aug 2023

The project is a port of https://github.com/nviennot/core-to-core-latency from Rust to C.

Compute Express Link CXL Latency How Much Is Added at HC34 (2022)
1 project | news.ycombinator.com | 7 Jun 2023

Very close to the point where SMT/HyperThreading might be enough, where we can just soak the latency & treat it basically like main memory. I would not be shocked to see SMT3 or SMT4 show up, once we see massively many-core scale-out CPUs with gobs of memory. Loads and stores take longer, so pipelines stall, so you want to be able to keep the core busy by switching to external work.
Also the pyramid in the diagram is somewhat too sunny a picture. I'd love some better numbers to stare at. But on a 1P AMD Milan, core-to-core latency across an 8-core CCX is in the low 20ns. That's tiny! Accessing memory on any other CCX has to go off the CCX to the IOD and back, which is high 80s to 110ns latency. This example is from an AWS c6a.metal. https://github.com/nviennot/core-to-core-latency#amd-epyc-7r...
Intel Ice Lake (c6i.metal), being monolithic, starts way worse. Any communication has to traverse a shared ring bus and thus takes 40-65ns. https://github.com/nviennot/core-to-core-latency#intel-xeon-...
M1 Pro is neat. An 8c chip has three CCX-like complexes: 2 performance clusters of 3c each and 1 efficiency cluster of 2c. Smart. Latency is 40ns within a cluster, 150ns outside the cluster. https://github.com/nviennot/core-to-core-latency#apple-m1-pr...
Doing anything off the first socket on AMD is terrible. 90-110ns across CCX on the same CCD, but any communication involving the 2nd CCX is a staggering 190ns to 210ns. https://github.com/nviennot/core-to-core-latency#dual-amd-ep... That's around what the pyramid shows as the upper end for CXL memory (170-250ns).
Please also kindly note these figures probably don't scale linearly with core clock speed, but they probably do scale somewhat, so direct comparison is ill-advised. But it's good, interesting data showing some very contemporary latency situations deep in the heart of computing that CXL is unlikely to be better than.
Using core-to-core is a weird proxy, but it illuminates how complex & odd providing system connectivity to the smaller CCX core clusters is. More on point is talking about main memory latency. Anandtech has great coverage of core-to-core, and also crucially main memory latency too. There's a lot of nuance & config variance here (NPS0-4), but there's generally a regime where a cluster can be getting around 12ns access, but it can very quickly ramp up to 110-130ns if trying to access wide ranges of memory. It starts to look like a core-to-core grade speed hit. https://www.anandtech.com/show/16529/amd-epyc-milan-review/4
Notably the IOD is basically a Northbridge controller connecting all the individual CCX clusters: key for talking to other clusters, key to talking to memory, key to exposing PCIe/CXL. If core-to-core is 150ns, say, it's well possible CXL's additional overhead could actually be quite marginal! Maybe, or not; maybe it will be entirely on top of this hit. Too early to tell, probably.
My gut feel is this pyramid is off. The peak is not as fast as they make it look today. But what exactly that means for CXL's latency is unknown.
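
The comparison the comment draws can be checked with a few lines of Python. This is only a sketch: it includes just the figures the comment gives as explicit numbers (the "low 20ns" and "high 80s" values are left out rather than guessed), and the labels and the names `QUOTED_NS`, `CXL_NS`, `overlaps`, and `in_cxl_range` are made up for illustration.

```python
# Latency figures in nanoseconds, copied from the comment above.
# Labels are illustrative; single values are stored as (x, x).
QUOTED_NS = {
    "Ice Lake (c6i.metal), ring bus": (40, 65),
    "M1 Pro, outside cluster": (150, 150),
    "dual-socket EPYC, 90-110ns case": (90, 110),
    "dual-socket EPYC, 190-210ns case": (190, 210),
}

# Range the comment attributes to the pyramid's CXL memory tier.
CXL_NS = (170, 250)


def overlaps(a, b):
    """Return True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def in_cxl_range(quoted=QUOTED_NS, cxl=CXL_NS):
    """List the quoted cases whose latency range overlaps the CXL range."""
    return [name for name, rng in quoted.items() if overlaps(rng, cxl)]


if __name__ == "__main__":
    print(in_cxl_range())
```

Of the explicitly quoted figures, only the 190-210ns dual-socket case lands inside the 170-250ns range, which matches the commenter's point.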

Intel Linux Kernel Optimizations Show Huge Benefit For High Core Count Servers
1 project | /r/linux | 30 Mar 2023

Yeah, but then you run into NUMA boundaries, and it's just a whole headache. Even cores within the same CPU communicate with each other at different speeds, which can make multithreading less efficient. https://github.com/nviennot/core-to-core-latency
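
One common way to cope with uneven core-to-core latency is to pin cooperating workers to cores that are close to each other. The sketch below is not taken from the thread; it assumes Linux, where Python exposes `os.sched_getaffinity` and `os.sched_setaffinity`, and the helper name `run_pinned` is hypothetical. Which cores count as "close" depends on the machine and has to come from a measurement such as the tool linked above.

```python
import os


def run_pinned(cpus, fn):
    """Run fn() with the caller restricted to the given CPU ids, then restore.

    Assumes Linux; pid 0 means "the caller" for the sched_* functions.
    """
    previous = os.sched_getaffinity(0)
    os.sched_setaffinity(0, cpus)
    try:
        return fn()
    finally:
        # Always restore the original affinity mask, even if fn() raises.
        os.sched_setaffinity(0, previous)


if __name__ == "__main__":
    # Pick any CPU we are currently allowed to run on.
    cpu = min(os.sched_getaffinity(0))
    print(run_pinned({cpu}, lambda: os.sched_getaffinity(0)))
```

Restoring the previous mask in `finally` keeps the helper safe to call from code that does not expect its affinity to change.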

Measuring CPU core-to-core latency
1 project | /r/patient_hackernews | 18 Sep 2022
1 project | /r/hackernews | 18 Sep 2022

Core-to-core latencies of the AMD EPYC Milan, 3rd gen
4 projects | /r/Amd | 18 Sep 2022

Measuring core-to-core latency (in Rust)
1 project | /r/hypeurls | 18 Sep 2022
4 projects | news.ycombinator.com | 18 Sep 2022

A tool to measure core-to-core latencies in Rust
1 project | /r/rust | 18 Sep 2022

Analysis of core-to-core latencies
2 projects | /r/intel | 18 Sep 2022

ipc-bench

Posts with mentions or reviews of ipc-bench. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-10-09.

IPC – Unix Signals
2 projects | news.ycombinator.com | 9 Oct 2023

Anyone who thinks they understand Unix signals is fooling themselves. Anyway, the claim that you can exchange half a million small messages per second using signals rests on a misunderstanding. The benchmark suite in question passes no data; it only ping-pongs the signal.
https://github.com/goldsborough/ipc-bench
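
To make "only ping-pongs the signal" concrete, here is a minimal Python sketch of such a round trip. It is not the ipc-bench code (which is written in C); it assumes a POSIX system and a single-threaded process, and the function name `signal_ping_pong` is hypothetical. Note that nothing but the signal number crosses between the two processes.

```python
import os
import signal
import time


def signal_ping_pong(rounds=1000):
    """Return the mean round-trip time in seconds of a signal ping-pong."""
    ping, pong = signal.SIGUSR1, signal.SIGUSR2
    # Block both signals so they stay pending until sigwait() collects them.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {ping, pong})
    parent = os.getpid()
    child = os.fork()
    if child == 0:
        # Child: answer every ping with a pong. No payload is carried.
        try:
            for _ in range(rounds):
                signal.sigwait({ping})
                os.kill(parent, pong)
        finally:
            os._exit(0)
    try:
        start = time.perf_counter()
        for _ in range(rounds):
            os.kill(child, ping)
            signal.sigwait({pong})
        elapsed = time.perf_counter() - start
        os.waitpid(child, 0)
    finally:
        # Restore the signal mask the caller had before the benchmark.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
    return elapsed / rounds


if __name__ == "__main__":
    print(f"mean round trip: {signal_ping_pong() * 1e6:.1f} us")
```

The number this produces is a wake-up latency, not a message rate: to move actual data, the processes would still need a separate channel such as a pipe or shared memory.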

Hey Rustaceans! Got a question? Ask here (6/2023)!
6 projects | /r/rust | 8 Feb 2023

Having now done some benchmarking, I would like to use shared_memory.

Measuring core-to-core latency (in Rust)
4 projects | news.ycombinator.com | 18 Sep 2022

I only use AF_UNIX sockets when I need to pass open file handles between processes. I generally prefer message queues: https://linux.die.net/man/7/mq_overview
I haven’t measured myself, but other people did, and they found the latency of message queues is substantially lower: https://github.com/goldsborough/ipc-bench
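
For readers unfamiliar with the use case the commenter mentions, this is roughly what passing an open file handle over an AF_UNIX socket looks like. The sketch assumes Python 3.9+ on a Unix system, where `socket.send_fds` and `socket.recv_fds` are available; it keeps both ends in one process for brevity, and the function name `pass_fd_demo` is hypothetical.

```python
import os
import socket


def pass_fd_demo():
    """Send a pipe's read end over an AF_UNIX socket pair and read through it."""
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    read_end, write_end = os.pipe()
    received = -1
    try:
        # The descriptor travels as ancillary data next to a small payload.
        socket.send_fds(sender, [b"fd"], [read_end])
        _msg, fds, _flags, _addr = socket.recv_fds(receiver, 16, 1)
        received = fds[0]  # a new descriptor referring to the same pipe
        os.write(write_end, b"hello")
        return os.read(received, 5)
    finally:
        for fd in (read_end, write_end, received):
            if fd >= 0:
                os.close(fd)
        sender.close()
        receiver.close()


if __name__ == "__main__":
    print(pass_fd_demo())
```

In real use the two socket ends would live in different processes; the receiving process gets its own descriptor number for the same underlying open file.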

High performance task/job submission between C# UI and C++ Backend
1 project | /r/cpp_questions | 1 Mar 2021

After reading your post, it seems as if you are interested in something called Inter Process Communication (IPC). There are as many solutions to this as there are birds in the sky (not really, but almost). If you want _speed_, take a look at this benchmark comparison: https://github.com/goldsborough/ipc-bench.

What are some alternatives?

When comparing core-to-core-latency and ipc-bench you can also consider the following projects:

c2clat - A tool to measure CPU core to core latency

multichase

hashbrown - Rust port of Google's SwissTable hash map

core-to-core-latency - Visualize core-to-core communication latency

Cargo - The Rust package manager

ryzen_smu - A Linux kernel driver that exposes access to the SMU (System Management Unit) for certain AMD Ryzen Processors. Read only mirror of https://gitlab.com/leogx9r/ryzen_smu

MicroBenchX - Micro benchmarks CPU/GPU

CoreFreq - CoreFreq : CPU monitoring and tuning software designed for 64-bit processors.
