| | glibc | ocaml |
|---|---|---|
| Mentions | 45 | 119 |
| Stars | 1,213 | 5,162 |
| Growth | 3.2% | 0.7% |
| Activity | 9.8 | 9.9 |
| Last commit | 9 days ago | 6 days ago |
| Language | C | OCaml |
| License | GNU General Public License v3.0 or later | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
glibc
- I cut GTA Online loading times by 70% (2021)
-
Cray-1 performance vs. modern CPUs
I wonder if you’re using a different definition of ‘vectorized’ from the one I would use. For example glibc provides a vectorized strlen. Here is the sse version: https://github.com/bminor/glibc/blob/master/sysdeps/x86_64/m...
It’s pretty simple to imagine how to write an unoptimized version: read a vector from the start of the string, compare it to 0, convert the result to a bitmask, and test whether the mask is zero. If it is, loop; otherwise count trailing zeros (ctz) to locate the terminator and finish.
I would call this vectorized because it operates on 16 bytes (sse) at a time.
There are a few issues:
1. You’re still spending a lot of time in the scalar code checking loop conditions.
2. You’re doing unaligned reads, which are slower on old processors.
3. You may read across a cache line, forcing you to pull a second line into cache even if the string ends before then.
4. You may read across a page boundary, which could cause a segfault if the next page is not accessible.
So the fixes are to do 64-byte (i.e. cache-line) aligned accesses, which also guarantees that no read crosses a page boundary (so you won’t read from a page until you know the string doesn’t end in the previous page). That deals with the alignment problems. You read four vector registers at a time, but this doesn’t really cost much more if the string is shorter, as it all comes from one cache line. Another trick in the linked code is that it first finds the cache line containing the terminator by reading the first 16 bytes and then merging in the next 3 groups with unsigned-min, so it only requires one test against a zero vector instead of 4. Then it finds the zero within that cache line. You need to do a bit of work in the first iteration to become aligned. With AVX, you can use mask registers on reads to handle that first step instead.
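As a rough illustration of the simpler 16-byte variant described above (not the 64-byte, four-register loop in the linked glibc code), the sketch below rounds the pointer down to a 16-byte boundary so every load is aligned and cannot cross a page. The name `sketch_strlen` is hypothetical, `__builtin_ctz` is a GCC/Clang builtin, and reading the few bytes before the start of the string inside the same aligned block is outside what ISO C permits; it is only what assembly implementations get away with. A scalar fallback is used when SSE2 is not available.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h> /* SSE2 intrinsics */
#endif

/* Hypothetical sketch, not glibc's implementation. */
size_t sketch_strlen(const char *s)
{
#if defined(__SSE2__)
    const __m128i zero = _mm_setzero_si128();

    /* Round down to a 16-byte boundary: aligned loads never straddle
       a cache line or a page. */
    size_t misalign = (size_t)((uintptr_t)s & 15);
    const char *p = s - misalign;

    /* First block: compare all 16 bytes with zero, turn the result into a
       16-bit mask, and discard the bits for bytes that precede s. */
    __m128i chunk = _mm_load_si128((const __m128i *)p);
    unsigned mask = (unsigned)_mm_movemask_epi8(_mm_cmpeq_epi8(chunk, zero));
    mask >>= misalign;
    if (mask != 0)
        return (size_t)__builtin_ctz(mask);

    /* Remaining blocks are fully inside the string's aligned region. */
    for (;;) {
        p += 16;
        chunk = _mm_load_si128((const __m128i *)p);
        mask = (unsigned)_mm_movemask_epi8(_mm_cmpeq_epi8(chunk, zero));
        if (mask != 0)
            return (size_t)(p - s) + (size_t)__builtin_ctz(mask);
    }
#else
    /* Portable scalar fallback. */
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
#endif
}
```

The lowest set bit of the mask corresponds to the lowest address, which is why the sketch counts trailing zeros to find the terminator.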
-
Setenv Is Not Thread Safe and C Doesn't Want to Fix It
That was also my thought. To my knowledge `/etc/localtime` is the creation of Arthur David Olson, the founder of the tz database (now maintained by IANA), but his code never read `/etc/localtime` multiple times unless the `TZ` environment variable was changed. Tzcode made it into glibc, but Ulrich Drepper changed it to not cache `/etc/localtime` when `TZ` is unset [1]; I wasn't able to locate the exact rationale, given that the commit is very old (1996-12) and no mailing list archive is available for that period.
[1] https://github.com/bminor/glibc/commit/68dbb3a69e78e24a778c6...
-
CTF Writeup: Abusing select() to factor RSA
That's not really what the problem is. The actual code is fine.
The issue is that the definition of `fd_set` has a constant size [1]. If you allocate the memory yourself, the select() system call will work with as many file descriptors as you care to pass to it. You can see that both glibc [2] and the kernel [3] support arbitrarily large arrays.
[1] https://github.com/bminor/glibc/blob/master/misc/sys/select....
[2] https://github.com/bminor/glibc/blob/master/sysdeps/unix/sys...
[3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...
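A minimal sketch of the "allocate the memory yourself" approach, assuming Linux with glibc, where the set is a plain array of `unsigned long` words with one bit per descriptor. The helper names `big_set_bytes` and `wait_readable` are hypothetical. The bit is set by hand instead of with `FD_SET`, because that macro is written for the fixed-size `fd_set` and fortified builds may reject descriptors at or above `FD_SETSIZE`.

```c
#include <limits.h>
#include <stdlib.h>
#include <sys/select.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Bytes needed for a bitmap covering descriptors 0..nfds-1,
   rounded up to whole unsigned long words. */
static size_t big_set_bytes(int nfds)
{
    size_t words = ((size_t)nfds + WORD_BITS - 1) / WORD_BITS;
    return words * sizeof(unsigned long);
}

/* Hypothetical helper: returns select()'s result for "is fd readable?",
   or -1 if the bitmap cannot be allocated. Assumes the Linux/glibc
   bitmap layout; other systems may differ. */
int wait_readable(int fd, struct timeval *timeout)
{
    unsigned long *set = calloc(1, big_set_bytes(fd + 1));
    if (set == NULL)
        return -1;

    /* Manual equivalent of FD_SET that works for any fd value. */
    set[(size_t)fd / WORD_BITS] |= 1UL << ((size_t)fd % WORD_BITS);

    int ready = select(fd + 1, (fd_set *)set, NULL, NULL, timeout);
    free(set);
    return ready;
}
```

Because the buffer is sized from `fd + 1` and not from `FD_SETSIZE`, the same code should work for descriptors above 1023, provided the process's open-file limit allows them.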
-
How are threads created in Linux x86_64
The source code for that is here.
-
Using Uninitialized Memory for Fun and Profit (2008)
Expanding the macro gives three GCC function attributes [2]: `__attribute__ ((malloc))`, `__attribute__ ((alloc_size(1)))` and `__attribute__ ((warn_unused_result))`. They are required for GCC (and other compilers that recognize them) to actually ensure that these functions behave as the standard dictates. Your own malloc-like functions won't be treated the same way unless you give them similar attributes.
[1] https://github.com/bminor/glibc/blob/807690610916df8aef17cd1...
[2] https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attribute...
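For illustration, here is how a user-defined allocator might carry the same three attributes. `checked_alloc` is a hypothetical name, and the syntax assumes GCC or a compatible compiler such as Clang.

```c
#include <stdlib.h>

/* Hypothetical malloc-like wrapper. The attributes tell the compiler that
   the result does not alias any other live pointer (malloc), that the
   returned block's size is given by parameter 1 (alloc_size), and that
   ignoring the return value deserves a warning (warn_unused_result). */
__attribute__((malloc, alloc_size(1), warn_unused_result))
void *checked_alloc(size_t size);

void *checked_alloc(size_t size)
{
    return malloc(size);
}
```

Without the attributes the function still works; the compiler simply has less information to optimize and diagnose with.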
-
“csinc”, the AArch64 instruction you didn’t know you wanted
IFUNC relocations are what enable glibc to dynamically choose the best memcpy routine at runtime based on the CPU.
See https://github.com/bminor/glibc/blob/glibc-2.31/sysdeps/x86_...
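A minimal sketch of the mechanism from user code, assuming GCC or Clang on x86-64 with glibc, using the compiler's `ifunc` attribute. All names here are hypothetical, and both "variants" simply call `memcpy` so the example stays short; a real library would supply differently tuned routines. On other platforms the sketch falls back to an ordinary function.

```c
#include <stddef.h>
#include <string.h> /* on glibc systems this also makes __GLIBC__ visible */

typedef void *copy_fn(void *, const void *, size_t);

/* Stand-ins for a baseline routine and a CPU-specific one. */
static void *copy_generic(void *d, const void *s, size_t n) { return memcpy(d, s, n); }
static void *copy_avx2(void *d, const void *s, size_t n) { return memcpy(d, s, n); }

#if defined(__GLIBC__) && defined(__x86_64__) && defined(__GNUC__)
/* The dynamic loader calls the resolver once while processing the
   relocation; every later call to sketch_copy goes straight to the
   function it returned. */
static copy_fn *resolve_sketch_copy(void)
{
    __builtin_cpu_init(); /* needed because this runs before constructors */
    return __builtin_cpu_supports("avx2") ? copy_avx2 : copy_generic;
}

void *sketch_copy(void *, const void *, size_t)
    __attribute__((ifunc("resolve_sketch_copy")));
#else
/* No IFUNC support assumed: plain static dispatch to the baseline. */
void *sketch_copy(void *d, const void *s, size_t n)
{
    (void)copy_avx2;
    return copy_generic(d, s, n);
}
#endif
```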
-
memmove() implementation in strictly conforming C -- possible?
memmove can very well be implemented in pure C; libc implementations usually have a "generic" (meaning architecture-independent) fallback. Here is musl's generic implementation and its x86-64 assembly implementation. For glibc, the implementation is a bit more complex, with multiple architectures implemented, but you can find a generic implementation in these two files: memmove.c and generic/memcopy.h.
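A byte-at-a-time sketch of what such a generic fallback might look like (`sketch_memmove` is a hypothetical name; real implementations copy word-sized chunks). The direction test converts the pointers to `uintptr_t` and relies on unsigned wraparound, which is implementation-defined and therefore not strictly conforming, but it avoids relational comparison of pointers into different objects.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical generic memmove: choose the copy direction so that
   overlapping source bytes are read before they are overwritten. */
void *sketch_memmove(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* If dst is below src the unsigned difference wraps to a huge value;
       if dst is at or beyond src + n it is simply >= n. In both cases a
       forward copy cannot clobber unread source bytes. */
    if ((uintptr_t)d - (uintptr_t)s >= n) {
        for (size_t i = 0; i < n; i++)
            d[i] = s[i];
    } else {
        /* dst lies inside [src, src + n): copy backwards. */
        for (size_t i = n; i > 0; i--)
            d[i - 1] = s[i - 1];
    }
    return dst;
}
```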
-
Fedora 38 LLVM vs. Team Fortress 2
Yeah, looks like the Q_strcat(pszContentPath, "/"); is invalid, as glibc has only allocated exactly enough to fit the path in the buffer returned by realpath().
Interestingly, the Open Group spec says that a null argument to realpath is "Implementation defined" [0].
And the Linux (glibc) man pages say it allocates a buffer "Up to PATH_MAX" [1].
I guess "strlen(path)" is "Up to PATH_MAX", but the man page seems unclear - you could read that as implying the buffer is always allocated to PATH_MAX size, but that's not what seems to be happening, just effectively calling strdup() [2]. I have no idea how to feed back to the linux man pages, but might be worth clarifying there.
[0] https://pubs.opengroup.org/onlinepubs/009696799/functions/re...
[1] https://linux.die.net/man/3/realpath
[2] https://github.com/bminor/glibc/blob/0b9d2d4a76508fdcbd9f421...
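A defensive way to append a trailing slash, sketched under the assumption that nothing about the capacity of the buffer returned by `realpath(path, NULL)` can be relied on beyond the string it holds. The helper name `resolved_dir_with_slash` is hypothetical.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700 /* asks the libc headers to declare realpath() */
#endif
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated "<resolved path>/",
   or NULL on failure. The caller frees the result. */
char *resolved_dir_with_slash(const char *path)
{
    char *resolved = realpath(path, NULL);
    if (resolved == NULL)
        return NULL;

    /* Do not append in place: the buffer may be exactly
       strlen(resolved) + 1 bytes long. */
    size_t len = strlen(resolved);
    char *out = malloc(len + 2); /* room for '/' and the terminator */
    if (out != NULL) {
        memcpy(out, resolved, len);
        out[len] = '/';
        out[len + 1] = '\0';
    }
    free(resolved);
    return out;
}
```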
-
Method implementations
For the actual sources you will have to look at one of the mirrors of the C standard library, such as https://github.com/bminor/glibc/tree/master/sysdeps/ieee754/dbl-64
ocaml
-
Autoconf makes me think we stopped evolving too soon
> OCaml’s configure script is also “normal”
If that’s this OCaml, it has a configure.ac file in the root directory, which looks suspicious for an Autotools-free package: https://github.com/ocaml/ocaml
-
The Return of the Frame Pointers
You probably already know, but with OCaml 5 the only way to get flamegraphs working is to either:
* use framepointers [1]
* use LBR (but LBR has a limited depth, and may not work on all CPUs, I'm assuming due to bugs in perf)
* implement some deep changes in how perf works to handle the 2 stacks in OCaml (I don't even know if this would be possible), or write/adapt some eBPF code to do it
OCaml 5 has a separate stack for OCaml code and C code, and although GDB can link them based on DWARF info, perf DWARF call-graphs cannot (https://github.com/ocaml/ocaml/issues/12563#issuecomment-193...)
If you need more evidence to keep it enabled in future releases, you can use OCaml 5 as an example (unfortunately there aren't many OCaml applications, so that may not carry too much weight on its own).
[1]: I hadn't actually realised that Fedora 39 has already enabled FP by default, nice! (I still do most of my day-to-day profiling on a ~CentOS 7 system with 'perf --call-graph dwarf'. I was aware that there was a discussion about enabling FP by default, but hadn't noticed that it has actually been done already.)
-
Top Paying Programming Technologies 2024
11. OCaml - $91,026
-
OCaml: a Rust developer's first impressions
> It partially helps since it forces you to have types where they matters most: exported functions
But the problem the OP has is not knowing the types when reading the source (in the .ml file).
> How would it feels like to use list if only https://github.com/ocaml/ocaml/blob/trunk/stdlib/list.ml was available,
If the signature were in the source file (which you can do in OCaml too), there would be no problem - which is what all the other (for some definition of "other") languages except C and C++ (even Fortran) do.
No, really, I can't see a single advantage of separate .mli files at all. The real problem is that the documentation is often worse too, as the .mli is autogenerated and documented afterwards - and now changes made later in the sources need to be documented in the .mli too, so anything that doesn't change the type often gets lost. The same happens in C and C++ with header files.
-
Bringing more sweetness to ruby with sorbet types 🍦
If you have been in the Ruby community for the past couple of years, it's possible that you're not a super fan of types or that this concept never passed through your mind, and that's totally cool. I myself love the dynamic and meta-programming nature of Ruby, and honestly, by the time of this article's writing, we aren't on the level of OCaml for type checking and inference, but still, there are a couple of nice things that types with sorbet bring to the table:
-
What is gained and lost with 63-bit integers? (2014)
Looks like there have been proposals to eliminate use of 3 operand lea in OCaml code (not accepted sadly):
https://github.com/ocaml/ocaml/pull/8531
-
Notes about the ongoing Perl logo discussion
An amazing example is the OCaml language logo / mascot. It might be useful to talk with them to learn what the process behind this work was. The camel head in the header of the About page on perl.org is also a pretty good example of simplification, but it's not a logo, just a friendly illustration, as the O'Reilly camel is. Another notable logo featuring this animal belongs to the well-known tobacco company, but don't get me started on that (“good” logo, though, if we look at the effectiveness of their marketing).
-
What can Category Theory do?
Haskell and Agda are probably the most obvious examples. OCaml too, but it is much older, so its type system is not as categorical. There is also Idris, which is not as well-known but is very cool.
- Playing Atari Games in OCaml
-
Bloat
That does sound problematic, but without the code it is hard to tell what the issue is. Typically, compiling a 6 kLoC file like https://github.com/ocaml/ocaml/blob/trunk/typing/typecore.ml takes 0.8 s on my machine.
What are some alternatives?
musl - Unofficial mirror of etalabs musl repository. Updated daily.
Alpaca-API - The Alpaca API is a developer interface for trading operations and market data reception through the Alpaca platform.
cosmopolitan - build-once run-anywhere c library
VisualFSharp - The F# compiler, F# core library, F# language service, and F# tooling integration for Visual Studio
dns - DNS library in Go
dune - A composable build system for OCaml.
0.30000000000000004 - Floating Point Math Examples
TradeAlgo - Stock trading algorithm written in Python for TD Ameritrade.
json-c - https://github.com/json-c/json-c is the official code repository for json-c. See the wiki for release tarballs for download. API docs at http://json-c.github.io/json-c/
melange - A mixture of tooling combined to produce JavaScript from OCaml & Reason
degasolv - Democratize dependency management.
rust - Rust for the xtensa architecture. Built in targets for the ESP32 and ESP8266