Compute Express Link CXL Latency How Much Is Added at HC34 (2022)

core-to-core-latency

11 934 1.8 Jupyter Notebook

Measures the latency between CPU cores

Very close to the point where SMT/HyperThreading might be enough, where we can just soak the latency & treat it basically like main memory. I would not be shocked to see SMT3 or SMT4 show up once we see massively many-core scale-out CPUs with gobs of memory. Loads and stores take longer, so pipelines stall, so you want to be able to keep the core busy by switching to other work.
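A rough way to see why longer stalls push toward SMT3 or SMT4 is a toy interleaving model. This is only a sketch: the function names are made up for illustration, and the 100ns of useful work between memory stalls is an assumed number, not a measurement. Each hardware thread alternates a burst of compute with a memory stall, and the core stays busy as long as some other thread is ready to run.

```python
import math


def core_utilization(hw_threads: int, compute_ns: float, stall_ns: float) -> float:
    """Fraction of time the core does useful work in a toy interleaving model.

    Each thread repeats: compute_ns of work, then stall_ns waiting on memory.
    While one thread stalls the core runs another, so utilization grows
    linearly with thread count until it saturates at 1.0.
    """
    if hw_threads < 1 or compute_ns <= 0 or stall_ns < 0:
        raise ValueError("need hw_threads >= 1, compute_ns > 0, stall_ns >= 0")
    return min(1.0, hw_threads * compute_ns / (compute_ns + stall_ns))


def threads_to_hide(compute_ns: float, stall_ns: float) -> int:
    """Smallest thread count at which the toy model reaches full utilization."""
    return math.ceil((compute_ns + stall_ns) / compute_ns)


# ASSUMPTION: 100ns of work between stalls, chosen only for illustration.
COMPUTE_NS = 100
for stall_ns in (100, 200):  # roughly DRAM-like vs. a CXL-like stall
    n = threads_to_hide(COMPUTE_NS, stall_ns)
    print(f"stall {stall_ns}ns: {n} threads to hide it, "
          f"SMT2 utilization {core_utilization(2, COMPUTE_NS, stall_ns):.2f}")
```

Under those assumed numbers, two threads cover a 100ns stall but a 200ns stall needs three. Real cores also overlap misses within a single thread, so treat this as intuition only.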
Also, the pyramid in the diagram paints a somewhat sunny picture. I'd love some better numbers to stare at. But on a 1P AMD Milan, core-to-core latency across an 8-core CCX is in the low 20s of ns. That's tiny! Accessing memory on any other CCX has to go off the CCX to the IOD and back, which is high 80s to 110ns latency. This example is from an AWS c6a.metal. https://github.com/nviennot/core-to-core-latency#amd-epyc-7r...
Intel Ice Lake (c6i.metal), being monolithic, starts way worse. Any communication has to traverse the shared mesh interconnect and thus takes 40-65ns. https://github.com/nviennot/core-to-core-latency#intel-xeon-...
M1 Pro is neat. An 8c chip has three CCX-like clusters: two performance clusters of 3 cores each and one efficiency cluster of 2 cores. Smart. Latency is about 40ns within a cluster and 150ns outside the cluster. https://github.com/nviennot/core-to-core-latency#apple-m1-pr...
Doing anything off the first socket on AMD is terrible. It's 90-110ns across CCXs on the same socket, but any communication involving a CCX on the 2nd socket is a staggering 190ns to 210ns. https://github.com/nviennot/core-to-core-latency#dual-amd-ep... That's around what the pyramid shows as the upper end for CXL memory (170-250ns).
Please also note that these figures probably don't scale linearly with core clock speed, but they probably do scale somewhat, so direct comparison is ill-advised. Still, it's good, interesting data showing some very contemporary latency situations deep in the heart of computing that CXL is unlikely to beat.
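To line the quoted figures up against the pyramid's upper-end CXL range, here is a small sketch. The numbers are only the ones stated explicitly above; the Milan 1P figures ("low 20s", "high 80s") are left out because they are quoted too loosely to encode. The labels and helper names are mine, and given the clock-speed caveat this is a rough side-by-side, not a benchmark.

```python
# Upper-end CXL memory range from the pyramid, as quoted above (ns).
CXL_RANGE_NS = (170, 250)

# (low, high) core-to-core latency in ns, as quoted above.
QUOTED_NS = {
    "Intel Ice Lake (c6i.metal)": (40, 65),
    "Apple M1 Pro, same cluster": (40, 40),
    "Apple M1 Pro, other cluster": (150, 150),
    "Dual AMD EPYC, same socket": (90, 110),
    "Dual AMD EPYC, off-socket": (190, 210),
}


def overlaps(a, b):
    """True if closed ranges a=(lo, hi) and b=(lo, hi) share any value."""
    return a[0] <= b[1] and b[0] <= a[1]


def in_cxl_territory(quoted=QUOTED_NS, cxl=CXL_RANGE_NS):
    """Names of the quoted cases whose latency range overlaps the CXL range."""
    return [name for name, rng in quoted.items() if overlaps(rng, cxl)]


for name, (lo, hi) in QUOTED_NS.items():
    flag = "overlaps" if overlaps((lo, hi), CXL_RANGE_NS) else "does not overlap"
    print(f"{name:30s} {lo:>3}-{hi:<3} ns  {flag} {CXL_RANGE_NS[0]}-{CXL_RANGE_NS[1]} ns")
```

Only the off-socket dual-EPYC case lands inside the 170-250ns band; everything else quoted sits below it.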
Using core-to-core latency is a weird proxy, but it illuminates how complex & odd providing system connectivity to the smaller CCX core clusters is. More on point is main memory latency. Anandtech has great coverage of core-to-core and, crucially, main memory latency too. There's a lot of nuance & config variance here (NPS0-4), but there's generally a regime where a cluster can get around 12ns access, and that can very quickly ramp up to 110-130ns when trying to access wide ranges of memory. It starts to look like a core-to-core grade speed hit. https://www.anandtech.com/show/16529/amd-epyc-milan-review/4
Notably, the IOD is basically a northbridge controller connecting all the individual CCX clusters: key to talking to other clusters, key to talking to memory, key to exposing PCIe/CXL. If core-to-core is, say, 150ns, it's quite possible CXL's additional overhead could actually be marginal! Or maybe not; maybe it will land entirely on top of this hit. Probably too early to tell.
My gut feel is this pyramid is off. The peak is not as fast as they make it look today. But what exactly that means for CXL's latency is unknown.


This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com Post date: 7 Jun 2023
