| | amaranth | cunumeric |
---|---|---
Mentions | 7 | 9 |
Stars | 1,436 | 595 |
Growth | 1.3% | 0.0% |
Activity | 9.7 | 8.5 |
Latest commit | 10 days ago | 1 day ago |
Language | Python | Python |
License | BSD 2-clause "Simplified" License | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
amaranth
-
Why are there only 3 languages for FPGA development?
He probably meant Amaranth.
-
VRoom – A high-end RISC-V implementation
As an aside: nMigen's latest and active development was rebranded a few months ago to Amaranth and can be found here: https://github.com/amaranth-lang/amaranth. This is for anyone who googled nMigen and landed on the repository that hasn't been updated in two years.
- NMigen – A Python toolbox for building complex digital hardware (FPGAs)
-
Facts every web dev should know before they burn out and turn to painting
Hmm. A follow-up question: are there any cheats/hacks that would make it possible (if painful) to explore, for example, the world of USB3, PCIe, or Linux on low-end-ish ARM (eg https://www.thirtythreeforty.net/posts/2019/12/my-business-c..., based on the 533MHz https://linux-sunxi.org/F1C100s) without needing to buy equipment in the mid-four-figure/low-five-figure range, if I could substitute a statistically larger-than-average amount of free time (and discipline)?
For example, I learned about https://github.com/GlasgowEmbedded/glasgow recently, a bit of a niche kitchen sink that uses https://github.com/nmigen/nmigen/ to lower a domain-specific subset of Python 3 (https://nmigen.info/nmigen/latest/lang.html) into Verilog which then runs on the Glasgow board's iCE40HX8K. The project basically makes it easier to use cheap FPGAs for rapid iteration. (The README makes a point that the synthesis is sufficiently fast that caching isn't needed.)
In certain very specific situations where circumstances align (caveat emptor), devices like this can present a temporary escape from the inevitable process of acquiring one's first second-hand high-end oscilloscope (fingers crossed the expensive bits still have a few years left in them). To some extent they may also commoditize the exploration of very high-speed interfaces, which are becoming a commonplace feature of computers (eg, having 10Gbps everywhere once USB 3.1 hits market saturation will be interesting) faster than test and analysis kit can keep up (eg for proper hardware security analysis work). The Glasgow is perhaps not quite an answer to all of that, but it may represent first steps in that direction.
So, to reiterate - it's probably an unhelpfully broad question, and I'm still learning about the field so I haven't quite got the precision I want yet, but I'm curious what gadgetry, techniques, etc. might allow someone to "hack it" and dive into this stuff on a shoestring budget? :)
-
Awesome Lattice FPGA Boards
Worth knowing that there are two "nmigen"s nowadays - the one originating in M-Labs, and one under a project also called nmigen:
https://github.com/nmigen/nmigen
It's a fork, made for reasons, but it is more actively developed. whitequark (long-time author/contributor) works on this fork, and no longer on the M-Labs version.
- Chisel/Firrtl Hardware Compiler Framework
-
Unifying the CUDA Python Ecosystem
Sounds like nmigen might be a good open source successor to the project that you describe: https://github.com/nmigen/nmigen
cunumeric
- Announcing Chapel 1.32
-
Is Parallel Programming Hard, and, If So, What Can You Do About It? [pdf]
I am biased because this is my research area, but I have to respectfully disagree. Actor models are awful, and the only reason it's not obvious is because everything else is even more awful.
But if you look at e.g., the recent work on task-based models, you'll see that you can have literally sequential programs that parallelize automatically. No message passing, no synchronization, no data races, no deadlocks. Read your programs as if they're sequential, and you immediately understand their semantics. Some of these systems are able to scale to thousands of nodes.
An interesting example of this is cuNumeric, which allows you to take sequential Python programs that use NumPy and, by changing one line (the import statement), run them automatically on clusters of GPUs. It is 100% pure awesomeness.
https://github.com/nv-legate/cunumeric
(I don't work on cuNumeric, but I do work on the runtime framework that cuNumeric uses.)
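Concretely, here is a minimal sketch of the one-line swap the comment describes. It runs as plain NumPy; per cuNumeric's README, replacing the import with `import cunumeric as np` is the documented drop-in change (everything else below is an ordinary sequential program of my own invention, not cuNumeric-specific code):

```python
import numpy as np
# import cunumeric as np   # <- the one-line change for clusters of GPUs

def mean_centered_norm(x):
    # Ordinary sequential array math; no message passing, no synchronization.
    centered = x - x.mean()
    return float(np.sqrt((centered * centered).sum()))

print(mean_centered_norm(np.arange(5.0)))  # → 3.1622776601683795 (sqrt(10))
```

The program's semantics stay those of the sequential NumPy version; the runtime decides how to partition and parallelize the array operations.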
-
GPT in 60 Lines of NumPy
I know this probably isn't intended for performance, but it would be fun to run this in cuNumeric [1] and see how it scales.
[1]: https://github.com/nv-legate/cunumeric
-
Dask – a flexible library for parallel computing in Python
If you want built-in GPU support (and distributed execution), you should check out cuNumeric (released by NVIDIA in the last week or so). It also avoids the need to manually specify chunk sizes, as a sibling comment notes.
https://github.com/nv-legate/cunumeric
-
Julia is the better language for extending Python
Try dask
Distribute your data and run everything as dask.delayed and then compute only at the end.
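To illustrate the pattern the comment describes, here is a stdlib-only toy of "build lazily, compute once at the end". This is not dask itself (real code would decorate functions with `dask.delayed` and call `.compute()` on the final result); the `Delayed` class, `delayed` decorator, and the `load`/`total` functions are all hypothetical stand-ins:

```python
# Toy sketch of the lazy "delayed" pattern: calls build a task graph,
# and nothing executes until .compute() is called on the final node.
class Delayed:
    def __init__(self, fn, *args):
        self.fn, self.args = fn, args

    def compute(self):
        # Recursively evaluate upstream tasks, then apply this one.
        args = [a.compute() if isinstance(a, Delayed) else a for a in self.args]
        return self.fn(*args)

def delayed(fn):
    return lambda *args: Delayed(fn, *args)

@delayed
def load(n):
    return list(range(n))

@delayed
def total(a, b):
    return sum(a) + sum(b)

# Builds the graph only; work happens in the single compute() at the end.
result = total(load(3), load(4))
print(result.compute())  # → 9
```

With real dask, deferring everything like this is what lets the scheduler parallelize and distribute the graph before any data moves.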
Also check out legate.numpy from Nvidia, which promises to be a drop-in NumPy replacement that will use all your CPU cores without any tweaks on your part.
https://github.com/nv-legate/legate.numpy
-
Learning more about HPC as a python guy
Something for the HPC tools category: https://github.com/nv-legate/legate.numpy
-
Unifying the CUDA Python Ecosystem
You might be interested in Legate [1]. It supports the NumPy interface as a drop-in replacement, supports GPUs and also distributed machines. And you can see for yourself their performance results; they're not far off from hand-tuned MPI.
[1]: https://github.com/nv-legate/legate.numpy
Disclaimer: I work on the library Legate uses for distributed computing, but otherwise have no connection.
- Legate NumPy: An Aspiring Drop-In Replacement for NumPy at Scale
What are some alternatives?
SpinalHDL - Scala based HDL
cupy - NumPy & SciPy for GPU
cocotb - cocotb, a coroutine based cosimulation library for writing VHDL and Verilog testbenches in Python
CudaPy - CudaPy is a runtime library that lets Python programmers access NVIDIA's CUDA parallel computation API.
chisel - Chisel: A Modern Hardware Design Language
CUDA.jl - CUDA programming in Julia.
chiselverify - A dynamic verification library for Chisel.
numba - NumPy aware dynamic Python compiler using LLVM
myhdl - The MyHDL development repository
legate.pandas - An Aspiring Drop-In Replacement for Pandas at Scale
pygears - HW Design: A Functional Approach
grcuda - Polyglot CUDA integration for the GraalVM