sse2neon vs aws-graviton-getting-started

| | sse2neon | aws-graviton-getting-started |
|---|---|---|
| Mentions | 7 | 62 |
| Stars | 1,224 | 817 |
| Growth | 1.2% | 1.1% |
| Activity | 7.3 | 8.5 |
| Last commit | 14 days ago | 2 days ago |
| Language | C++ | Python |
| License | MIT License | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
sse2neon
- sse2neon - A C/C++ header file that converts Intel SSE intrinsics to Aarch64 NEON intrinsics
-
Porting Architecture Specific C/C++ Intrinsics to Graviton
The sse2neon project is a quick way to get C/C++ applications compiling and running on Graviton. The sse2neon header file provides NEON implementations of the x64 intrinsics, so beyond swapping the include, no source code changes are needed: each function call (intrinsic) is mapped to NEON instructions and just works on Graviton.
-
An AWS Community Builder Story
To continue our collaboration, I contributed some small changes to KasmVNC on GitHub, using sse2neon for a performance-critical part of the application that relies on SSE intrinsics and needed to be ported to NEON.
-
Deserializing JSON Fast
I think the talk is very clearly laid out as an incremental journey, and each stepping stone involves contextual decision-making. I don't think Andreas is saying "you must end up with the SSE2 implementation at the end". Using machine-specific intrinsics is another dependency decision very similar to deciding to use a given library. I would have loved the talk and probably still thought of it and posted it, even if it ended before the intrinsics (but I think he does an excellent job at that part too).
And porting SSE2 to Neon is actually pretty easy -- if you use https://github.com/DLTcollab/sse2neon, IME it's very easy to do incrementally (or avoid or postpone indefinitely, depending on your needs).
-
PortableGL: An MIT licensed implementation of OpenGL 3.x-ish in clean C
I have a private cross-platform port, I’m waiting on the resolution of his latest GitHub issue to submit my changes. sse2neon (https://github.com/DLTcollab/sse2neon) was a big help - I also wrote a very primitive sse2scalar for raspbian builds where neon is unavailable. Honestly SIMD doesn’t help much, as you’re usually memory bound under SWGL. The biggest perf win is any amount of asynchronous execution - running off the main thread is good enough and could be applied to your library externally through a command buffer without any changes to your code.
-
Success porting VCV into aarch64 linux! (Usable on Android Devices)
You should go to /include/simd and download sse2neon.h into that folder. Replace the x86 intrinsics header include appearing in any source files in that directory with "sse2neon.h". You will still encounter errors; remove the lines causing problems, typically those containing the phrase ZERO_MODE. ARM processors do not require it.
aws-graviton-getting-started
- AWS Graviton Technical Guide
- Getting started with AWS Graviton: the million-dollar question
-
What infra did you deploy for Iceberg/Hudi/Delta?
EMR Serverless + Athena + Glue works for us. We are evaluating Graviton instances to further optimize things. AWS link if you are interested
-
Slash CAPEX, OPEX, and Carbon Emissions with T408
Now we turn our attention to carbon emissions, which are presented in Table 8. In the table, the AMD – CPU only and AMD – T408 server power figures (watts) are actual measurements on the test system during operation. To estimate the AWS server power, we reduced the CPU-only AMD number by 60%, the energy savings Amazon claims Graviton3 CPUs provide over comparable CPUs. In all three cases, we multiplied this figure by the number of servers, then by hours per day, days per year, and years, to compute the three-year power consumption total.
-
Framework ARM
https://aws.amazon.com/ec2/graviton/ https://cloud.google.com/compute/docs/instances/arm-on-compute
-
Google Has Developed Its Own Data Center Server Chips
From the relevant product page [0]:
"AWS Graviton3 processors feature always-on memory encryption, dedicated caches for every vCPU, and support for pointer authentication."
Further reading on pointer authentication [1].
[0] https://aws.amazon.com/ec2/graviton/
[1] https://www.qualcomm.com/content/dam/qcomm-martech/dm-assets...
-
can i repurpose a server and make it a computer
Amazon makes their own Arm CPUs, like the Graviton3: https://aws.amazon.com/ec2/graviton/
-
Cost Cutting AWS strategies
Read More about Graviton Processors
-
Blackberry Partnership Panning Out!
According to BlackBerry, both QNX and IVY can run on EC2 instances powered by AWS' Graviton2 processor. Graviton2 is an internally-developed processor that AWS debuted at re:Invent last year. It promises to provide up to 40% better price performance than comparable chips.
- AWS Graviton
What are some alternatives?
yenten-arm-miner-yespowerr16 - ARM 64 CPU miner for Yespower variant algorithms
drupal-pi - Drupal on Docker on a Raspberry Pi. Pi Dramble's little brother.
KasmVNC - Modern VNC Server and client, web based and secure
simde - Implementations of SIMD instruction sets for systems which don't natively support them.
buildx - Docker CLI plugin for extended build capabilities with BuildKit
Tow-Boot - An opinionated distribution of U-Boot. — https://matrix.to/#/#Tow-Boot:matrix.org?via=matrix.org
examples - TensorFlow examples
libsamplerate - An audio Sample Rate Conversion library
sysbench - Scriptable database and system performance benchmark
cglm - 📽 Highly Optimized 2D / 3D Graphics Math (glm) for C
examples - A set of examples around pytorch in Vision, Text, Reinforcement Learning, etc.