intel-extension-for-transformers vs xbyak_aarch64

| | intel-extension-for-transformers | xbyak_aarch64 |
|---|---|---|
| Mentions | 3 | 5 |
| Stars | 1,970 | 176 |
| Growth | 4.8% | 0.6% |
| Activity | 9.9 | 7.0 |
| Last commit | 2 days ago | 5 months ago |
| Language | Python | C++ |
| License | Apache License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
intel-extension-for-transformers
- Intel Extension for Transformers
- How do you think LLM inference on CPUs?
- 📢Excited to announce https://github.com/intel/intel-extension-for-transformers v1.1 released. Congrats team! 🔥Supported efficient fine-tuning and inference on Xeon SPR and Habana Gaudi 🎯Enabled 4-bits LLM inference on Xeon (better than llama.cpp); improved lm-eval-harness for multiple frameworks
xbyak_aarch64
- Docker Environment for ARM SVE
  You can use the ARM SVE instructions in two ways. The first is to use intrinsic functions: ARM provides intrinsics for the C language, called the Arm C Language Extensions (ACLE). The other is to write the ARM SVE instructions directly. Writing assembly by hand is hard, however, so I recommend using a JIT assembler called Xbyak. Xbyak is a JIT assembler developed by MITSUNARI Shigeo. It was initially developed for x86, but was also released for AArch64.
- NooDS - A Nintendo DS emulator
  The code base looks pretty nice and modern, and I think a good speedup would be to recompile the ARM code into x86 using something like xbyak or xbyak_aarch64. Gonna keep an eye on this one and maybe even contribute sometime!
- Xbyak_aarch64: JIT assembler for AArch64 CPUs in C++
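
The first of the two approaches described in the "Docker Environment for ARM SVE" mention, ACLE intrinsics, can be sketched roughly as follows. This is a minimal illustration, not code from any of the projects listed here: `add_arrays` is a hypothetical helper name, and the SVE path is only compiled when the compiler defines `__ARM_FEATURE_SVE` (for example, GCC or Clang with `-march=armv8-a+sve`). On any other target the sketch falls back to a plain scalar loop so that it still builds. The JIT route through Xbyak_aarch64 is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

#if defined(__ARM_FEATURE_SVE)
#include <arm_sve.h>  // ACLE intrinsics for SVE
#endif

// Element-wise sum of two equally sized float arrays.
// add_arrays is a hypothetical name used only for this illustration.
std::vector<float> add_arrays(const std::vector<float>& a,
                              const std::vector<float>& b) {
    assert(a.size() == b.size());
    std::vector<float> out(a.size());
    const std::int64_t n = static_cast<std::int64_t>(a.size());
#if defined(__ARM_FEATURE_SVE)
    // svcntw() is the number of 32-bit lanes per vector. SVE code is
    // vector-length agnostic, so this value is only known at run time.
    const std::int64_t step = static_cast<std::int64_t>(svcntw());
    for (std::int64_t i = 0; i < n; i += step) {
        // The predicate enables only lanes whose index is below n,
        // so the final partial vector needs no separate scalar tail loop.
        const svbool_t pg = svwhilelt_b32(i, n);
        const svfloat32_t va = svld1_f32(pg, a.data() + i);
        const svfloat32_t vb = svld1_f32(pg, b.data() + i);
        svst1_f32(pg, out.data() + i, svadd_f32_x(pg, va, vb));
    }
#else
    // Scalar fallback for targets without SVE.
    for (std::int64_t i = 0; i < n; ++i) {
        out[static_cast<std::size_t>(i)] =
            a[static_cast<std::size_t>(i)] + b[static_cast<std::size_t>(i)];
    }
#endif
    return out;
}
```

Calling `add_arrays({1.0f, 2.0f}, {10.0f, 20.0f})` returns a vector holding the two element-wise sums on either code path. The predicated loop is the main difference from fixed-width SIMD code: the same binary adapts to whatever vector length the hardware provides.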
What are some alternatives?
diffusion-expert - A software for drawing with stable-diffusion support
xbyak - a JIT assembler for x86(IA-32)/x64(AMD64, x86-64) MMX/SSE/SSE2/SSE3/SSSE3/SSE4/FPU/AVX/AVX2/AVX-512 by C++ header
Stable-Diffusion-NCNN - Stable Diffusion in NCNN with c++, supported txt2img and img2img
sleef - SIMD Library for Evaluating Elementary Functions, vectorized libm and DFT
athena - an open-source implementation of sequence-to-sequence based speech processing engine
xbyak_aarch64_handson - Tutorials for ARM SVE on Docker
lightseq - LightSeq: A High Performance Library for Sequence Processing and Generation
Atmosphere - Atmosphère is a work-in-progress customized firmware for the Nintendo Switch.
FasterTransformer - Transformer related optimization, including BERT, GPT
NooDS - A (hopefully!) speedy DS emulator
wenet - Production First and Production Ready End-to-End Speech Recognition Toolkit
mgba - mGBA Game Boy Advance Emulator