Metric | samsara | are-we-fast-yet |
---|---|---|
Mentions | 6 | 18 |
Stars | 64 | 315 |
Growth | - | - |
Activity | 10.0 | 8.8 |
Last commit | over 1 year ago | 3 months ago |
Language | Rust | Java |
License | - | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
samsara
-
Garbage Collection for Systems Programmers
> IME it's the other way around, per-object individual lifetimes is a rare special case
It depends on your application domain. But in most cases where objects have "individual lifetimes" you can still use reference counting, which has lower latency and memory overhead than tracing GC and interacts well with manual memory management. Tracing GC can then be "plugged in" for very specific cases, preferably using a high performance concurrent implementation much like https://github.com/chc4/samsara (for Rust) or https://github.com/pebal/sgcl (for C++).
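To make the reference-counting point concrete, here is a minimal sketch using Rust's standard `Rc`. The `Tracked` type and `shared_lifetime_demo` function are hypothetical names introduced for illustration. The sketch only shows that an object with several owners is released at the moment its last owner goes away, with no collector involved; it does not measure latency or memory overhead.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical resource that records when it is released.
struct Tracked {
    drops: Rc<Cell<u32>>,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Returns (owner count while shared,
///          drops seen after the first owner is gone,
///          drops seen after the last owner is gone).
fn shared_lifetime_demo() -> (usize, u32, u32) {
    let drops = Rc::new(Cell::new(0));
    let a = Rc::new(Tracked { drops: Rc::clone(&drops) });
    let b = Rc::clone(&a); // second owner: strong count becomes 2
    let shared = Rc::strong_count(&a);

    drop(a); // one owner remains, so the object stays alive
    let after_first = drops.get();

    drop(b); // last owner gone: released right here, deterministically
    let after_last = drops.get();

    (shared, after_first, after_last)
}

fn main() {
    let (shared, after_first, after_last) = shared_lifetime_demo();
    println!(
        "owners: {}, drops after first release: {}, after last release: {}",
        shared, after_first, after_last
    );
}
```

What plain reference counting cannot do is reclaim cycles, which is the gap that a pluggable cycle collector is meant to fill.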
-
Why choose async/await over threads?
> Just for example: "it needs a GC" could be the heart of such an argument
Rust can actually support high-performance concurrent GC, see https://github.com/chc4/samsara for an experimental implementation. But unlike other languages it gives you the option of not using it.
-
Boehm Garbage Collector
The compiler support you need is quite limited. Here's an implementation of cycle collection in Rust (https://github.com/chc4/samsara). It's made possible because Rust can tell apart read-only and read-write references (except for interior mutable objects, but these are known to the compiler and references to them can be treated as read-write). This avoids a global stop-the-world for the entire program.
Cascading deletes are rare in practice, and if anything they are inherent to deterministic deletion, which is often a desirable property. When they're possible, one can often use arena allocation to avoid the issue altogether, since arenas are managed as a single object.
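The arena remark can be sketched as follows. This is a minimal index-based arena written for illustration, not the API of any particular arena crate; `Arena`, `Node`, `alloc`, `sum_from`, and `build_chain` are all hypothetical names. Nodes refer to each other by index instead of by owning pointer, so the whole structure is freed as one `Vec` rather than through a chain of per-node destructors.

```rust
/// Hypothetical index-based arena: all nodes live in one Vec.
struct Arena {
    nodes: Vec<Node>,
}

struct Node {
    value: u64,
    next: Option<usize>, // index into Arena::nodes, not an owning pointer
}

impl Arena {
    fn new() -> Self {
        Arena { nodes: Vec::new() }
    }

    /// Stores a node and returns its index.
    fn alloc(&mut self, value: u64, next: Option<usize>) -> usize {
        self.nodes.push(Node { value, next });
        self.nodes.len() - 1
    }

    /// Follows `next` links starting at `start` and sums the values.
    fn sum_from(&self, start: usize) -> u64 {
        let mut total = 0;
        let mut cur = Some(start);
        while let Some(i) = cur {
            total += self.nodes[i].value;
            cur = self.nodes[i].next;
        }
        total
    }
}

/// Builds a chain holding 0, 1, ..., len-1 and returns the arena and head index.
fn build_chain(len: usize) -> (Arena, usize) {
    let mut arena = Arena::new();
    let mut head = None;
    for v in 0..len as u64 {
        head = Some(arena.alloc(v, head));
    }
    (arena, head.expect("len must be greater than zero"))
}

fn main() {
    let (arena, head) = build_chain(1_000_000);
    println!("nodes: {}, sum: {}", arena.nodes.len(), arena.sum_from(head));
    // `arena` is dropped here as a single Vec: one deallocation,
    // no node-by-node cascade.
}
```

The trade-off in this design is that individual nodes cannot be freed early; everything lives until the arena itself is dropped.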
-
Steel – An embedded scheme interpreter in Rust
There are concurrent GC implementations for Rust, e.g. Samsara (https://redvice.org/2023/samsara-garbage-collector/ and https://github.com/chc4/samsara), that avoid blocking, except to a minimal extent in rare cases of contention. That fits pretty well with the pattern of "doing a bit of GC every frame".
-
Removing Garbage Collection from the Rust Language (2013)
There are a number of efforts along these lines. The most interesting is probably Samsara (https://github.com/chc4/samsara and https://redvice.org/2023/samsara-garbage-collector/), which implements a concurrent, thread-safe GC with no global "stop the world" phase.
-
I built a garbage collector for a language that doesn't need one
Nice blog post! I also wrote a concurrent reference counted cycle collector in Rust (https://github.com/chc4/samsara) though never published it to crates.io. It's neat to see the different choices that people made implementing similar goals, and dumpster works pretty differently from how I did it. I hit the same problems wrt concurrent mutation of the graph when trying to count in-degree of nodes, or adding references during a collection - I didn't even think of doing generational references and just have a RwLock...
are-we-fast-yet
-
Boehm Garbage Collector
> Sure there's a small overhead to smart pointers
Not so small, and it has the potential to significantly slow down an application when not used wisely. Here are e.g. some measurements where the programmer used C++11 and did everything with smart pointers: https://github.com/smarr/are-we-fast-yet/issues/80#issuecomm.... There was a slowdown by a factor of 2 to 10 compared with the C++98 implementation. Also remember that smart pointers create memory leaks when used with circular references, and there is an additional memory allocation involved with each smart pointer.
> Garbage collection has an overhead too of course
The Boehm GC is surprisingly efficient. See e.g. these measurements: https://github.com/rochus-keller/Oberon/blob/master/testcase.... The same benchmark suite as above is compared with different versions of Mono (using the generational GC) and the C code (using Boehm GC) generated with my Oberon compiler. The latter is only 20% slower than the native C++98 version, and still twice as fast as Mono 5.
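The remark in this post that smart pointers leak memory when used with circular references can be illustrated with a small sketch. It uses Rust's `Rc` and `Weak` as stand-ins for C++ `std::shared_ptr` and `std::weak_ptr`; that substitution is an assumption made for illustration, since the linked measurements are for C++. The `Node` type and the function names are hypothetical.

```rust
use std::cell::{Cell, RefCell};
use std::rc::{Rc, Weak};

/// Hypothetical graph node that counts how many nodes were destroyed.
struct Node {
    drops: Rc<Cell<u32>>,
    strong_peer: RefCell<Option<Rc<Node>>>, // owning edge
    weak_peer: RefCell<Weak<Node>>,         // non-owning edge
}

impl Drop for Node {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

fn new_node(drops: &Rc<Cell<u32>>) -> Rc<Node> {
    Rc::new(Node {
        drops: Rc::clone(drops),
        strong_peer: RefCell::new(None),
        weak_peer: RefCell::new(Weak::new()),
    })
}

/// Two nodes that own each other: returns how many were destroyed.
fn drops_with_strong_cycle() -> u32 {
    let drops = Rc::new(Cell::new(0));
    {
        let a = new_node(&drops);
        let b = new_node(&drops);
        *a.strong_peer.borrow_mut() = Some(Rc::clone(&b));
        *b.strong_peer.borrow_mut() = Some(Rc::clone(&a));
    } // both handles are gone, but each node still keeps the other alive
    drops.get()
}

/// Same shape, but the back edge is weak: returns how many were destroyed.
fn drops_with_weak_back_edge() -> u32 {
    let drops = Rc::new(Cell::new(0));
    {
        let a = new_node(&drops);
        let b = new_node(&drops);
        *a.strong_peer.borrow_mut() = Some(Rc::clone(&b));
        *b.weak_peer.borrow_mut() = Rc::downgrade(&a);
    } // no ownership cycle, so both nodes are released
    drops.get()
}

fn main() {
    println!("strong cycle, nodes destroyed: {}", drops_with_strong_cycle());
    println!("weak back edge, nodes destroyed: {}", drops_with_weak_back_edge());
}
```

Breaking cycles by hand with weak references works when the ownership structure is known up front. A tracing collector or a cycle collector handles the case where it is not.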
-
A C++ version of the Are-we-fast-yet benchmark suite
See https://github.com/smarr/are-we-fast-yet/blob/master/docs/guidelines.md.
-
The Bitter Truth: Python 3.11 vs. Cython vs. C++ Performance for Simulations
That's a very interesting article, thanks. Interesting to note that Cython is only about twice as fast as Python 3.10 and only about 40% faster than Python 3.11.
The official Python site advertises a speedup of 25% from 3.10 to 3.11; in the article a speedup of 60% was measured. It therefore usually makes sense to measure different algorithms. Unfortunately there is no Python or C++ implementation yet for https://github.com/smarr/are-we-fast-yet.
- Comparing Language Implementations with Objects, Closures, and Arrays
- Are We Fast Yet? Comparing Language Implementations with Objects, Closures, and Arrays
-
.NET 6 vs. .NET 5: up to 40% speedup
> Software benchmarks are super subjective.
No, they are not, but they are just a measurement tool, not a source of absolute truth. When I studied engineering at ETH we learned "Who measures measures rubbish!" ("Wer misst misst Mist!" in German). Every measurement has errors, and being aware of these errors and coping with them is part of the engineering profession. The problem with programming language benchmarks is often that the goal is to win by all means; to compare as fairly and objectively as possible instead, there must be a set of suitable rules adhered to by all benchmark implementations. Such a set of rules is e.g. given for the Are-we-fast-yet suite (https://github.com/smarr/are-we-fast-yet).
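One simple way to act on the point that every measurement has errors is to repeat a benchmark and report the spread alongside the mean. The sketch below is a generic illustration and not the harness used by the Are-we-fast-yet suite; `time_runs` and `mean_and_stddev` are hypothetical helper names, and the workload in `main` is an arbitrary placeholder.

```rust
use std::time::Instant;

/// Mean and sample standard deviation of a set of measurements.
/// Returns (0.0, 0.0) for an empty slice and a deviation of 0.0 for one sample.
fn mean_and_stddev(samples: &[f64]) -> (f64, f64) {
    if samples.is_empty() {
        return (0.0, 0.0);
    }
    let n = samples.len() as f64;
    let mean = samples.iter().sum::<f64>() / n;
    if samples.len() < 2 {
        return (mean, 0.0);
    }
    let variance = samples.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / (n - 1.0);
    (mean, variance.sqrt())
}

/// Runs `work` `runs` times and returns each wall-clock duration in milliseconds.
fn time_runs<F: FnMut()>(runs: usize, mut work: F) -> Vec<f64> {
    (0..runs)
        .map(|_| {
            let start = Instant::now();
            work();
            start.elapsed().as_secs_f64() * 1000.0
        })
        .collect()
}

fn main() {
    // Placeholder workload; black_box keeps the result from being optimized away.
    let samples = time_runs(10, || {
        let s: u64 = (0..1_000_000u64).map(|x| x.wrapping_mul(x)).sum();
        std::hint::black_box(s);
    });
    let (mean, stddev) = mean_and_stddev(&samples);
    println!("mean: {:.3} ms, stddev: {:.3} ms over {} runs", mean, stddev, samples.len());
}
```

A difference between two implementations that is smaller than the observed spread is not a meaningful result on its own.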
-
Is CoreCLR that much faster than Mono?
I am aware of the various published test results where CoreCLR shows fantastic speed-ups compared to Mono, e.g. when calculating MD5 or SHA hash sums.
But my measurements based on the Are-we-fast-yet benchmark suite (see https://github.com/smarr/are-we-fast-yet and https://github.com/rochus-keller/Oberon/tree/master/testcases/Are-we-fast-yet) show a completely different picture. Here the difference between Mono and CoreCLR (both versions 3 and 5) is within +/- 10%, so nothing earth shattering.
Here are my measurement results:
https://github.com/rochus-keller/Oberon/blob/master/testcases/Are-we-fast-yet/Are-we-fast-yet_results_linux.pdf comparing the same benchmark on the same machine run under LuaJIT, Mono, Node.js and Crystal.
https://github.com/rochus-keller/Oberon/blob/master/testcases/Are-we-fast-yet/Are-we-fast-yet_results_windows.pdf comparing Mono, .Net 4 and CoreCLR 3 and 5 on the same machine.
Here are the assemblies of the Are-we-fast-yet benchmark suite used for the measurements, in case you want to reproduce my results: http://software.rochus-keller.ch/Are-we-fast-yet_CLI_2021-08-28.zip.
I was very surprised by the results. Perhaps it has to do with the fact that I measured on x86, or that the benchmark suite used includes somewhat larger (i.e. more representative) applications than just micro benchmarks.
What are your opinions? Do others have similar results?
-
Is CoreCLR really that much faster than Mono?
There is a good reason for this; have a look at e.g. https://github.com/smarr/are-we-fast-yet/blob/master/docs/guidelines.md.
-
Why most programming language performance comparisons are most likely wrong
Then apparently the SOM nbody program is taken as the basis of a new Java nbody program.
What are some alternatives?
sundial-gc - WIP: my Tweag open source fellowship project
gleam - ⭐️ A friendly language for building type-safe, scalable systems!
nitro - Experimental OOP language that compiled to native code with non-fragile and stable ABI
crystal - The Crystal Programming Language
gara
fast-ruby - Writing Fast Ruby -- Collect Common Ruby idioms.
patty - A pattern matching library for Nim
PyCall.jl - Package to call Python functions from the Julia language
node-libnmap - API to access nmap from node.js
Oberon - Oberon parser, code model & browser, compiler and IDE with debugger
qcell - Statically-checked alternatives to RefCell and RwLock
Smalltalk - Parser, code model, interpreter and navigable browser for the original Xerox Smalltalk-80 v2 sources and virtual image file