corundum
hls4ml
| | corundum | hls4ml |
|---|---|---|
| Mentions | 28 | 11 |
| Stars | 1,460 | 1,103 |
| Stars growth | 3.7% | 5.8% |
| Activity | 9.4 | 9.1 |
| Last commit | 4 months ago | 5 days ago |
| Language | Verilog | C++ |
| License | GNU General Public License v3.0 or later | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
corundum
- FuryGpu – Custom PCIe FPGA GPU
The GPU uses this: https://github.com/alexforencich/verilog-pcie . And there is an open-source 100G NIC here, including open source 10G/25G MACs: https://github.com/corundum/corundum
- Open source: Corundum – FPGA-based NIC and platform for in-network compute
- TCP checksum computation
- Are there any free/open source Lattice ECP5 Ethernet MAC IP Cores?
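One of the mentions above concerns TCP checksum computation. As a quick sketch of the underlying algorithm, the Internet checksum (RFC 1071) used by TCP, UDP, and IP headers is the ones' complement of the ones' complement sum of the data taken as 16-bit words:

```python
def internet_checksum(data: bytes) -> int:
    """RFC 1071 Internet checksum, as used by TCP/UDP/IP."""
    if len(data) % 2:  # pad odd-length data with a zero byte
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # 16-bit big-endian words
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF  # ones' complement of the sum
```

A useful property for verification: summing a received segment with its checksum field included yields 0 if the data is intact. In hardware (as in Corundum-style designs) the same fold-the-carries structure maps naturally onto a tree of adders, but the snippet above is only a software illustration, not code from that project.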
- xilinx versal gty testbench/data gen?
Well, I did build this: https://github.com/corundum/corundum
- FPGA for finance industry
I would look into 10GbE PCS/MAC packet processors built around AXI Stream interfaces, for example. There are open-source examples at https://github.com/corundum/corundum and https://netfpga.org/ .
- Computer Networking Nerd and EE Student Looking to build a Baremetal Network Driver on top of baremetal kernel? Is this possible and if so, I'd like some guidance!
I built my own 100 Gbps capable NIC, along with driver: https://github.com/corundum/corundum. You're welcome to ask if you have any questions, though it is quite a different animal from a 100 Mbps NIC you might have on a microcontroller.
- Device Drivers for Transceiver Questions (Specifically, PCIe)
If you're looking for resources, here's one rather comprehensive example of a high-performance FPGA design with a fully custom DMA engine and driver, that runs on both Xilinx and Intel FPGAs: https://github.com/corundum/corundum
- shift/concatenate in v/sv
I have no idea, but you're welcome to build the design and look at it yourself: https://github.com/corundum/corundum/tree/master/fpga/mqnic/NetFPGA_SUME/fpga. The barrel shifters are in the DMA engine, both the read DMA and write DMA engines have wide barrel shifters.
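For intuition about what those wide barrel shifters look like, a barrel shifter is usually built as log2(width) mux stages, each conditionally shifting by a power-of-two amount selected by one bit of the shift value. A small Python model of that structure (illustrative only, not Corundum's actual RTL):

```python
def barrel_shift_left(value: int, shift: int, width: int = 512) -> int:
    """Model of a logarithmic barrel shifter: log2(width) stages of
    2:1 muxes, each conditionally shifting by a power of two."""
    mask = (1 << width) - 1  # keep only `width` bits, as hardware would
    stage = 1
    while stage < width:
        if shift & stage:  # this stage's select bit of the shift amount
            value = (value << stage) & mask  # shift by 2**k, drop overflow
        stage <<= 1
    return value
```

In a DMA engine this is what realigns a byte-offset payload onto, say, a 512-bit data bus: each stage is one layer of muxes, so a 512-bit shifter needs 9 stages rather than a single enormous mux.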
- Open source projects?
Dive right into the Slack channel and introduce yourself. There is also a new contributor guide. /u/alexforencich/ is on these subreddits and may be able to chime in with more concrete suggestions.
hls4ml
- How to participate in open-source FPGA projects?
- Looking for HLS frameworks to start deploying DL algorithms on FPGAs
- Hi, What could be the best HLS tool for implementing neural networks on FPGA
I see that someone has already suggested hls4ml, and I second that opinion. From my experience, it is extremely well documented: they have published papers that explain the scientific background, a really nice Git page where they explain all the features of the tool, and an easy-to-follow tutorial for doing it from scratch with TensorFlow networks. You can find all the information here: hls4ml.
- 5 layered CNN implementation on arduino/FPGAs [P]
Open source project that originated at Fermilab: https://github.com/fastmachinelearning/hls4ml (based on Xilinx Vivado HLS, which has since been replaced by Vitis HLS)
- Help needed to build a Hardware accelerator for CNN's
You may check the hls4ml framework: it's a "translator" from the ML model (Keras, PyTorch) to a synthesizable High-Level Synthesis (HLS) IP Core.
- Sub ms - 3ms Latency Vision task on FPGA
It really depends on the type of data you are using, and there may (or may not) be some trade-offs and sacrifices. There are frameworks that translate a neural network described in high-level Python code into equivalent HLS code optimized for low-latency inference on FPGAs. Two frameworks worth exploring are hls4ml and FINN; both can achieve low-latency neural network inference on FPGAs using Xilinx Vitis HLS. These are what I found when I did a similar experiment about a year ago, though with a much tighter latency target (a few hundred ns) and a very simple MLP taking a 1D signal as input, so I am not sure whether better alternatives exist as of 2023. Conceptually, they all work on the same primary principle: the framework first quantizes the network and limits the data precision to fixed point, then applies dataflow techniques when generating the HLS code so that the resulting RTL achieves the best overall latency.
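The fixed-point quantization step that answer describes can be sketched in plain Python. This models the rounding and saturation behavior of an `ap_fixed<W,I>`-style type in a simplified way; the function name and defaults are illustrative, not hls4ml's or FINN's actual API:

```python
def quantize_fixed(x: float, total_bits: int = 16, int_bits: int = 6) -> float:
    """Round a float to signed fixed point with `total_bits` bits total,
    `int_bits` of them (including sign) before the binary point,
    saturating on overflow."""
    frac_bits = total_bits - int_bits
    scale = 1 << frac_bits
    q = round(x * scale)  # round to the nearest LSB
    lo, hi = -(1 << (total_bits - 1)), (1 << (total_bits - 1)) - 1
    q = max(lo, min(hi, q))  # saturate instead of wrapping
    return q / scale
```

Limiting weights and activations to a representation like this is what lets the generated RTL use small fixed-point multipliers and adders instead of floating-point units, which is where most of the latency and resource savings come from.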
- looking for resources to design a basic deep learning feed forward accelerator
Check hls4ml. Developed by CERN for fast classification in FPGA for high-energy physics experiments.
- How to build FPGA-based ML accelerator?
I would check out hls4ml. It's an open source project made by/for people at CERN to convert neural networks created in Python using QKeras (a quantization extension of Keras) into HLS, with Vivado HLS being the most well supported. There are some caveats though, and a fellow student and I have had trouble getting the generated HLS to match the Keras model and be feasible to synthesize, but it seems to work well for smaller neural networks.
- How are TensorFlow Models implemented on PYNQ's PS & PL
Since you're looking for PL-only implementation, HLS4ML may fit your needs. It was developed to port TensorFlow models directly to FPGAs in particle physics experiments. Current development allows for implementation on SoC and MPSoC, though.
- Open source projects?
What are some alternatives?
verilog-ethernet - Verilog Ethernet components for FPGA implementation
qkeras - QKeras: a quantization deep learning library for Tensorflow Keras
rssguard - Feed reader (and podcast player) which supports RSS/ATOM/JSON and many web-based feed services.
Silice - Silice is an easy-to-learn, powerful hardware description language, that simplifies designing hardware algorithms with parallelism and pipelines.
NvChad - Blazing fast Neovim config providing solid defaults and a beautiful UI, enhancing your neovim experience.
v4l2rtspserver - RTSP Server for V4L2 device capture supporting HEVC/H264/JPEG/VP8/VP9
litex - Build your hardware, easily!
srs - SRS is a simple, high-efficiency, real-time video server supporting RTMP, WebRTC, HLS, HTTP-FLV, SRT, MPEG-DASH, and GB28181.
soft_riscv - Soft-core RISCV processor for RISCV 2018 competition
PipelineC - A C-like hardware description language (HDL) adding high level synthesis(HLS)-like automatic pipelining as a language construct/compiler feature.
psram-tang-nano-9k - An open source PSRAM/HyperRAM controller for Sipeed Tang Nano 9K / Gowin GW1NR-LV9QN88PC6/15 FPGA
fastocloud_com - Self-hosted IPTV/NVR/CCTV/Video service (Community version) [Moved to: https://github.com/fastogt/fastocloud]