jemalloc VS tcmalloc

Compare jemalloc vs tcmalloc and see what their differences are.

                 jemalloc                                    tcmalloc
Mentions         34                                          15
Stars            9,046                                       4,069
Growth           0.8%                                        1.2%
Activity         8.3                                         9.8
Last commit      15 days ago                                 7 days ago
Language         C                                           C++
License          GNU General Public License v3.0 or later    Apache License 2.0
The number of mentions indicates the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

jemalloc

Posts with mentions or reviews of jemalloc. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-04-09.
  • Show HN: Comprehensive inter-process communication (IPC) toolkit in modern C++
    2 projects | news.ycombinator.com | 9 Apr 2024
    - Split up a certain important C++ service into several parts, for various reasons, without adding latency to the request path.

    The latter task meant, among other things, communicating large amounts of user data from server application to server application. capnp-encoded structures (sometimes big - but not necessarily) would also need to be transmitted; as would FDs.

    The technical answers to these challenges are not necessarily rocket science. FDs can be transmitted via Unix domain socket as "ancillary data"; the POSIX `sendmsg()` API is hairy but usable. Small messages can be transmitted via Unix domain socket, or pipe, or POSIX MQ (etc.). Large blobs of data, however, would not be okay to transmit via those transports: too much copying into and out of kernel buffers is involved, which would add major latency, so we'd have to use shared memory (SHM). Certainly a hairy technology... but again, doable. And as for capnp - well - you "just" code a `MessageBuilder` implementation that allocates segments in SHM instead of the regular heap (which is what `capnp::MallocMessageBuilder` does).
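
    (To make the ancillary-data bit concrete, here is a minimal sketch of passing an FD over a connected AF_UNIX socket with `sendmsg()` and SCM_RIGHTS. The `sock` and `fd_to_send` descriptors are assumed to exist already; this is plain POSIX, not Flow-IPC code.)

      // Minimal sketch: send one file descriptor over a connected AF_UNIX socket
      // as SCM_RIGHTS ancillary data. `sock` and `fd_to_send` are assumed to
      // already exist; error handling is pared down for brevity.
      #include <sys/socket.h>
      #include <cstring>

      bool send_fd(int sock, int fd_to_send) {
        char byte = 0;                       // at least one byte of normal payload
        iovec iov{&byte, sizeof byte};

        char ctrl[CMSG_SPACE(sizeof(int))];  // room for the ancillary data
        std::memset(ctrl, 0, sizeof ctrl);

        msghdr msg{};
        msg.msg_iov = &iov;
        msg.msg_iovlen = 1;
        msg.msg_control = ctrl;
        msg.msg_controllen = sizeof ctrl;

        cmsghdr* cmsg = CMSG_FIRSTHDR(&msg);
        cmsg->cmsg_level = SOL_SOCKET;
        cmsg->cmsg_type = SCM_RIGHTS;        // "pass file descriptors"
        cmsg->cmsg_len = CMSG_LEN(sizeof(int));
        std::memcpy(CMSG_DATA(cmsg), &fd_to_send, sizeof(int));

        return sendmsg(sock, &msg, 0) == 1;  // one payload byte sent on success
      }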

    Thing is, I noticed that various parts of the company had similar needs. I've observed some variation of each of the aforementioned tasks custom-implemented - again, and again, and again. None of these implementations could really be reused anywhere else. Most of them ran into the same problems - none of which is that big a deal on its own, but together (and across projects) it more than adds up. To coders it's annoying. And to the business, it's expensive!

    Plus, at least one thing actually proved to be technically quite hard. Sharing (via SHM) a native C++ structure involving STL containers and/or raw pointers: downright tough to achieve in a general way. At least with Boost.interprocess (https://www.boost.org/doc/libs/1_84_0/doc/html/interprocess....) - which is really quite thoughtful - one can accomplish a lot... but even then, there are key limitations, in terms of safety and ease of use/reusability. (I'm being a bit vague here... trying to keep the length under control.)
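
    (For a sense of what that looks like, here is a rough Boost.interprocess sketch of an STL-style container in SHM; the segment name and sizes are made up for illustration.)

      // Rough illustration: an STL-style vector living in shared memory via
      // Boost.interprocess. Allocations go through the segment's allocator and
      // internal pointers become offset pointers. Names/sizes are illustrative.
      #include <boost/interprocess/managed_shared_memory.hpp>
      #include <boost/interprocess/allocators/allocator.hpp>
      #include <boost/interprocess/containers/vector.hpp>

      namespace bip = boost::interprocess;
      using ShmAllocator =
          bip::allocator<int, bip::managed_shared_memory::segment_manager>;
      using ShmVector = bip::vector<int, ShmAllocator>;

      int main() {
        bip::managed_shared_memory segment(bip::create_only, "demo_shm", 64 * 1024);

        ShmAllocator alloc(segment.get_segment_manager());
        ShmVector* v = segment.construct<ShmVector>("my_vector")(alloc);
        v->push_back(42);  // stored in SHM; another process can open "demo_shm"
                           // and find "my_vector" by name
        return 0;
      }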

    So, I decided to not just design/code an "IPC thing" for that original key C++ service I was being asked to split... but rather one that could be used as a general toolkit, for any C++ applications. Originally we named it Akamai-IPC, then renamed it Flow-IPC.

    As a result of that origin story, Flow-IPC is... hmmm... meat-and-potatoes, pragmatic. It is not a "framework." It does not replace or compete with gRPC. (It can, instead, speed RPC frameworks up by providing the zero-copy transmission substrate.) I hope that it is neither niche nor high-maintenance.

    To wit: If you merely want to send some binary-blob messages and/or FDs, it'll do that - and make it easier by letting you set up a single session between the two processes, instead of making you worry about socket names and cleanup. (But that's optional! If you simply want to set up a Unix domain socket yourself, you can.) If you want to add structured messaging, it supports Cap'n Proto - as noted - and right out of the box it'll be zero-copy end-to-end. That is, it'll do all the SHM stuff without a single `shm_open()` or `mmap()` or `ftruncate()` on your part. And if you want to customize how that all works, those layers and concepts are formally available to you. (No need to modify Flow-IPC yourself: just implement certain concepts and plug them in, at compile-time.)
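
    (To illustrate the `MessageBuilder` idea from above: Cap'n Proto lets a builder override how segments are allocated, so they can land in SHM instead of the heap. In the sketch below, `shm_arena_alloc()` is a hypothetical helper standing in for an SHM allocator - it is not Flow-IPC's actual API.)

      // Sketch: a Cap'n Proto MessageBuilder whose segments live in shared memory.
      // shm_arena_alloc() is a hypothetical allocator over an SHM region (NOT the
      // real Flow-IPC API); it must return zero-filled, word-aligned memory.
      #include <capnp/message.h>
      #include <kj/common.h>
      #include <cstddef>

      void* shm_arena_alloc(std::size_t bytes);  // assumed to exist for this sketch

      class ShmMessageBuilder : public capnp::MessageBuilder {
      public:
        kj::ArrayPtr<capnp::word> allocateSegment(unsigned minimumSize) override {
          // minimumSize is in words; hand back an SHM-backed buffer of that size.
          auto* words = static_cast<capnp::word*>(
              shm_arena_alloc(minimumSize * sizeof(capnp::word)));
          return kj::arrayPtr(words, minimumSize);
        }
      };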

    Lastly, for those who want to work with native C++ data directly in SHM, it'll simplify setup/cleanup considerably compared to what's typical. For the original Akamai service in question, we needed to use SHM as intensively as one typically uses the regular heap. So, in particular, Boost.interprocess's two built-in SHM-allocation algorithms were not sufficient. We needed something more industrial-strength. So we adapted jemalloc (https://jemalloc.net/) to work in SHM, and worked that into Flow-IPC as a standard available feature. (jemalloc powers FreeBSD and big parts of Meta.) So jemalloc's anti-fragmentation algorithms, thread caching - all that stuff - will work for our SHM allocations.
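
    (For the curious, the jemalloc mechanism that makes this kind of adaptation possible is its "extent hooks": an arena can be told to obtain its backing memory from a custom source. The sketch below points an arena at a hypothetical `shm_map()` helper; it shows the mechanism only, not Flow-IPC's actual implementation.)

      // Sketch: create a jemalloc arena whose backing memory comes from a custom
      // source. shm_map() is a hypothetical helper that maps shared memory; the
      // remaining hooks (dalloc, commit, purge, split, merge, ...) are omitted
      // here, but a real integration must supply whatever its jemalloc version
      // requires.
      #include <jemalloc/jemalloc.h>
      #include <cstddef>

      void* shm_map(std::size_t size, std::size_t alignment);  // assumed helper

      static void* shm_extent_alloc(extent_hooks_t*, void* new_addr, size_t size,
                                    size_t alignment, bool* zero, bool* commit,
                                    unsigned /*arena_ind*/) {
        if (new_addr != nullptr) return nullptr;  // fixed addresses unsupported here
        void* p = shm_map(size, alignment);
        if (p != nullptr) { *zero = true; *commit = true; }
        return p;
      }

      static extent_hooks_t shm_hooks = { shm_extent_alloc };

      unsigned make_shm_arena() {
        unsigned arena_ind;
        size_t sz = sizeof(arena_ind);
        extent_hooks_t* hooks = &shm_hooks;
        // New arena using our hooks; allocate from it via
        // mallocx(n, MALLOCX_ARENA(arena_ind)).
        mallctl("arenas.create", &arena_ind, &sz, &hooks, sizeof(hooks));
        return arena_ind;
      }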

    Having accepted this basic plan - develop a reusable IPC library that handled the above oft-repeated needs - Eddy Chan joined and contributed heavily, especially on the jemalloc aspects. A couple of years later we had it ready for internal Akamai use. All throughout we kept it general - not Akamai-specific (and certainly not specific to that original C++ service that started it all off) - and personally I felt it was a very natural candidate for open-source.

    To my delight, once I announced it internally, the immediate reaction from higher-up was, "you should open-source it." Not only that, we were given the resources and goodwill to actually do it. I have learned that it's not easy to make something like this presentable publicly, even having developed it with that in mind. (BTW it is about 69k lines of code, 92k lines of comments, excluding the Manual.)

    So, that's what happened. We wrote a thing useful for various teams internally at Akamai - and then Akamai decided we should share it with the world. That's how open-source thrives, we figured.

    On a personal level, of course it would be gratifying if others found it useful and/or themselves contributed. What a cool feeling that would be! After working with exemplary open-source stuff like capnp, it'd be amazing to offer even a fraction of that usefulness. But, we don't gain from "market share." It really is just there to be useful. So we hope it is!

  • Finding memory leaks in Postgres C code
    1 project | news.ycombinator.com | 29 Mar 2024
    jemalloc also has some handy leak / memory-profiling abilities: https://github.com/jemalloc/jemalloc/wiki/Use-Case%3A-Heap-P...
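
    (A sketch of how that is typically driven, assuming jemalloc was built with --enable-prof and the process was started with something like MALLOC_CONF="prof:true,prof_leak:true,lg_prof_sample:0": a heap profile can then be dumped from code and inspected with jeprof.)

      // Sketch: dump a jemalloc heap profile from inside the program. Requires a
      // profiling-enabled build (--enable-prof) with profiling switched on via
      // MALLOC_CONF; the resulting file is analyzed with the jeprof tool.
      #include <jemalloc/jemalloc.h>

      void dump_heap_profile(const char* path) {
        // Writing a filename dumps the profile there; per the jemalloc docs, a
        // null filename falls back to automatic <prefix>.<pid>.<seq> naming.
        mallctl("prof.dump", nullptr, nullptr, &path, sizeof(path));
      }
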
  • Speed of Rust vs. C
    2 projects | news.ycombinator.com | 23 Feb 2024
    The worst memory performance bug I ever saw turned out to be heap fragmentation in a non-GC system. There are memory allocators that solve this like https://github.com/jemalloc/jemalloc/tree/dev but ... they do it by effectively running a GC at the block level

    As soon as you use atomic counters in a multi-threaded system you can wave goodbye to your scalability too!

  • Understanding Mesh Allocator
    2 projects | news.ycombinator.com | 26 Jan 2024
    The linked talk video mentioned they're playing with it in jemalloc and tcmalloc.

    I found this https://github.com/jemalloc/jemalloc/issues/1440 but couldn't find tcmalloc doing similar.

    These guys are aware of mesh and compare against it: https://abelay.github.io/6828seminar/papers/maas:llama.pdf

  • Atomics and Concurrency
    3 projects | news.ycombinator.com | 12 Jan 2024
    I think the point, rather, was not to allocate at all in critical sections, since allocator implementations are not lock-free or wait-free.

    https://github.com/jemalloc/jemalloc/blob/dev/src/mutex.c

  • Rust std:fs slower than Python
    7 projects | news.ycombinator.com | 29 Nov 2023
    Be aware `jemalloc` will make you suffer the observability issues of `MADV_FREE`. `htop` will no longer show the truth about how much memory is in use.

    * https://github.com/jemalloc/jemalloc/issues/387#issuecomment...

    * https://gitlab.haskell.org/ghc/ghc/-/issues/17411

    Apparently now `jemalloc` will call `MADV_DONTNEED` 10 seconds after `MADV_FREE`:

  • How does the OS know how much virtual memory is needed?
    1 project | /r/C_Programming | 1 Jul 2023
    jemalloc (the default FreeBSD malloc, also used by Rust) http://jemalloc.net/
  • The Overflowing Timeout Error - A Debugging Journey in Memgraph!
    1 project | dev.to | 15 Mar 2023
    Of course, we are not working on one feature at a time, we're doing things in parallel. While working on the timers, we introduced jemalloc into our codebase. After merging the jemalloc changes, tests for the timers started to fail. And what kind of failure? Segmentation faults, of course, what else...
  • Google's OSS-Fuzz expands fuzz-reward program to $30000
    3 projects | news.ycombinator.com | 2 Feb 2023
    https://github.com/jemalloc/jemalloc/issues/2222

    Strangely, these bugs were found by the CI of ClickHouse, and not by any of the hundreds of other products using these libraries.

  • My app stop working
    1 project | /r/docker | 30 Jan 2023
    2- WARNING Memory overcommit must be enabled! Without it, a background save or replication may fail under low memory condition. Being disabled, it can can also cause failures without low memory condition, see https://github.com/jemalloc/jemalloc/issues/1328. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.

tcmalloc

Posts with mentions or reviews of tcmalloc. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-04-06.
  • Configuring HugePages on Google's TCMalloc
    1 project | /r/cpp_questions | 25 Jun 2023
    https://github.com/google/tcmalloc/issues/190
  • Configuring HugePages on TCMalloc
    1 project | /r/cpp | 24 Jun 2023
    I had earlier raised a query on github.com/google/tcmalloc regarding how I can force tcmalloc to back memory with hugetlbfs instead of using Transparent Huge Pages. I have attached the link to my query below. Please let me know if there is a possible way to do this.
  • New memory related fields in Yugabyte 2.17.3 pg_stat_activity
    1 project | dev.to | 8 May 2023
    The allocated_mem_bytes field shows the memory allocated by the memory allocator. PostgreSQL is set up in an extensible way, which includes the ability to change the memory allocator; by default it uses the operating system's allocator (ptmalloc on glibc-based systems), while YSQL uses tcmalloc.
  • Spotting and Avoiding Heap Fragmentation in Rust Applications
    3 projects | news.ycombinator.com | 6 Apr 2023
    > * Switching from libc malloc to tcmalloc (dating myself a little bit)

    If you think of tcmalloc as an old crusty allocator, you've probably only seen the gperftools version of it.

    This is the version Google now uses internally: https://github.com/google/tcmalloc

    It's worth a fresh look. In particular, it supports per-CPU caches as an alternative to per-thread caches. Those are fantastic if you have a lot more threads than CPUs.

  • I've had bad luck with transparent hugepages on my Linux machines
    1 project | news.ycombinator.com | 1 Feb 2023
    The default setting of max_ptes_none is also problematic.

    On a stock kernel, it's 511. TCMalloc's docs recommend using max_ptes_none set to 0 for this reason: https://github.com/google/tcmalloc/blob/master/docs/tuning.m...

    (Disclosure: I work on TCMalloc and authored the above doc.)
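
    (For reference, that knob lives in sysfs; one minimal way to apply the recommendation - as root - is a small helper like the sketch below. The path is the standard Linux location.)

      // Sketch: apply the max_ptes_none = 0 recommendation by writing the sysfs
      // knob directly. Must run as root; affects subsequent khugepaged scans.
      #include <fstream>

      int main() {
        std::ofstream knob(
            "/sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_none");
        knob << 0;
        return knob.good() ? 0 : 1;
      }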

  • Pages Are a Good Idea
    1 project | news.ycombinator.com | 22 Jan 2023
    The easiest way to exploit THP, by far, is to link your program against TCMalloc and forget about it. Literally free money. Highly recommended.

    https://github.com/google/tcmalloc

  • Why tcmalloc using aggresive decommit == false is a litte better than jemalloc
    1 project | news.ycombinator.com | 27 Sep 2022
  • System memory allocator free operation zeroes out deallocated blocks in iOS 16
    4 projects | news.ycombinator.com | 22 Sep 2022
  • malloc() and free() are a bad API
    2 projects | /r/C_Programming | 31 Aug 2022
    This means that an efficient malloc implementation is typically overly complicated. mimalloc, for example, is almost 8K lines of C AFAIK, and it is one of the smaller but still efficient malloc implementations I'm aware of. (Try looking into tcmalloc for comparison.)
  • malloc global mutex?
    2 projects | /r/cpp_questions | 22 Jun 2022
    Yes, it is synchronized, and you can typically also swap out the implementation. There are different allocators out there depending on what you are trying to optimize for (memory, single-thread performance, multithread performance, locality, etc.). A lot of multithreading-optimized ones use per-thread pools, so an individual allocation doesn't need to take a global lock; changing the pools themselves does, as do large allocations that aren't part of the pools. For example: https://github.com/google/tcmalloc
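
    (Below is a tiny sketch of the workload being described: many threads allocating and freeing small blocks concurrently. With a per-thread-cache allocator such as tcmalloc or jemalloc, most of those calls avoid the global lock; swapping the allocator in is typically just a link-time or LD_PRELOAD change, with no code changes.)

      // Tiny illustration of the contended-malloc workload described above: many
      // threads allocating and freeing small blocks concurrently. With a
      // per-thread-cache allocator most of these calls never take a global lock.
      #include <cstdlib>
      #include <thread>
      #include <vector>

      int main() {
        std::vector<std::thread> workers;
        for (int t = 0; t < 8; ++t) {
          workers.emplace_back([] {
            for (int i = 0; i < 1000000; ++i) {
              void* p = std::malloc(64);  // small alloc: served from thread/CPU cache
              std::free(p);
            }
          });
        }
        for (auto& w : workers) w.join();
        return 0;
      }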

What are some alternatives?

When comparing jemalloc and tcmalloc you can also consider the following projects:

mimalloc - mimalloc is a compact general purpose allocator with excellent performance.

image-spec - OCI Image Format

tbb - oneAPI Threading Building Blocks (oneTBB) [Moved to: https://github.com/oneapi-src/oneTBB]

tinyrenderer - A brief computer graphics / rendering course

rust-scudo

dlmalloc - Doug Lea's memory allocator

rpmalloc - Public domain cross platform lock free thread caching 16-byte aligned memory allocator implemented in C

Hoard - The Hoard Memory Allocator: A Fast, Scalable, and Memory-efficient Malloc for Linux, Windows, and Mac.

glibc - Unofficial mirror of sourceware glibc repository. Updated daily.

gperftools - Main gperftools repository

compiler-rt - Project moved to: https://github.com/llvm/llvm-project