It was a bit more complicated than that: the issue was that one of the early big.LITTLE designs (Samsung's Exynos 8890) had different cacheline sizes on the big and little cores.
libgcc's `__clear_cache` would cache the cacheline size on first call, so if the program was started on a big core and then migrated onto a little core, it would only flush every other cacheline.
And the mitigation was not to "change its allocation", it was to bypass libgcc and handroll cache clearing: https://github.com/mono/mono/pull/3549
Source: https://www.mono-project.com/news/2016/09/12/arm64-icache/
And this issue didn't only affect Mono, it affected pretty much any JIT running on that phone: e.g. dolphin (https://github.com/dolphin-emu/dolphin/pull/4204) and ppsspp (https://github.com/hrydgard/ppsspp/pull/8965) had been hitting the same issue and adopted Mono's fix. PPSSPP hadn't been able to find the root cause, so they'd originally implemented a gnarly hack by adding a bunch of padding (https://github.com/hrydgard/ppsspp/pull/8769).
But fundamentally this is the 8890 being broken: as the Mono post notes, technically nothing precludes core migration in the middle of clearing the cache, which would also lead to broken behaviour, with no mitigation.
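To make the "every other cacheline" failure concrete, here is a rough simulation over plain integers; it issues no real cache maintenance instructions. The 128-byte and 64-byte line sizes, the address range, and the names `flushed_lines` and `missed_lines` are all assumptions picked for illustration. Stepping by the smaller line size at the end is one plausible mitigation in the same spirit, not a copy of the linked patch.

```rust
/// Addresses a flush loop would touch when walking [start, end) in steps of
/// `stride` bytes. `stride` must be a power of two.
fn flushed_lines(start: usize, end: usize, stride: usize) -> Vec<usize> {
    let mut addr = start & !(stride - 1); // align down to the stride
    let mut touched = Vec::new();
    while addr < end {
        touched.push(addr);
        addr += stride;
    }
    touched
}

/// Lines of size `line` overlapping [start, end) that no flushed address fell into.
/// A flush at address `f` is modelled as cleaning the single line containing `f`.
fn missed_lines(start: usize, end: usize, line: usize, flushed: &[usize]) -> Vec<usize> {
    let mut missed = Vec::new();
    let mut addr = start & !(line - 1);
    while addr < end {
        let hit = flushed.iter().any(|&f| f & !(line - 1) == addr);
        if !hit {
            missed.push(addr);
        }
        addr += line;
    }
    missed
}

fn main() {
    // Assumed sizes: the big core reports 128-byte lines, the little core has 64-byte lines.
    const BIG_LINE: usize = 128;
    const LITTLE_LINE: usize = 64;
    let (start, end) = (0x1000, 0x1200);

    // Line size was cached while running on a big core, then the thread migrated.
    let stale = flushed_lines(start, end, BIG_LINE);
    let missed = missed_lines(start, end, LITTLE_LINE, &stale);
    println!("stale stride touched {} addresses, missed lines: {:x?}", stale.len(), missed);

    // Stepping by the smaller of the two sizes covers every line on either core type.
    let safe = flushed_lines(start, end, BIG_LINE.min(LITTLE_LINE));
    let still_missed = missed_lines(start, end, LITTLE_LINE, &safe);
    println!("min stride missed {} lines", still_missed.len());
}
```

With these assumed numbers the stale loop touches four addresses and leaves four 64-byte lines unflushed, while the smaller stride leaves none. As noted above, this still does nothing about a migration that happens in the middle of the loop.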
> On a single machine I can just replace an `.iter()` into a `.par_iter()` in my Rust programs and boom, I parallelized my workload across every core with only a single line change.
If you’re using Rayon for anything you expect to benefit from a lot of cores, be warned that my experiences with it have been very negative in the past. Rayon is extremely inefficient once you allow it to use more than a handful of cores.[0]
If your unit of work is absolutely massive, you might not be as affected by Rayon’s scheduler, but it makes me sad that this issue still hasn’t been resolved. My comment on that thread was almost 4 years ago! And the issue itself is almost a year older than that.
I no longer consider Rayon to be an advantage for Rust. There are plenty of good ways to wire up a parallel work pool, but they do require more than a one line change.
[0]: https://github.com/rayon-rs/rayon/issues/394
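For a sense of what "more than a one line change" can look like, here is a minimal sketch that uses only the standard library's scoped threads with static chunking. The function name `parallel_sum_squares` and the workload are hypothetical, and this is just one simple scheme: it does no work stealing, so it only suits workloads where every item costs roughly the same.

```rust
use std::thread;

/// Sum of squares, splitting the slice into one contiguous chunk per worker.
fn parallel_sum_squares(data: &[u64], workers: usize) -> u64 {
    let workers = workers.max(1);
    // Ceiling division so every element lands in some chunk; at least 1 so
    // `chunks` never receives a zero length.
    let chunk_len = ((data.len() + workers - 1) / workers).max(1);

    thread::scope(|s| {
        // Spawn one scoped thread per chunk; scoped threads may borrow `data`.
        let handles: Vec<_> = data
            .chunks(chunk_len)
            .map(|chunk| s.spawn(move || chunk.iter().map(|x| x * x).sum::<u64>()))
            .collect();

        // Join every worker and combine the partial sums.
        handles
            .into_iter()
            .map(|h| h.join().expect("worker panicked"))
            .sum::<u64>()
    })
}

fn main() {
    let data: Vec<u64> = (1..=1_000).collect();
    // Fall back to a single worker if the core count cannot be determined.
    let workers = thread::available_parallelism().map(|n| n.get()).unwrap_or(1);
    let total = parallel_sum_squares(&data, workers);
    println!("sum of squares with {workers} workers: {total}");
}
```

The worker count is an explicit parameter, so you can cap it and measure where adding cores stops paying off for your own workload.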