Experience with heap bloat

This page summarizes the projects mentioned and recommended in the original post on /r/rust

  • loadtxt

    ~60-300x faster than numpy.loadtxt

  • Amdahl's Law will catch up with you really fast as you add threads with this strategy, but it's simple and it works for formats where a delimiter may appear in the middle of a record. For situations where you need maximum scaling and records can't contain stray delimiters, you can use the strategy I used to implement a faster numpy.loadtxt: https://github.com/saethlin/loadtxt/blob/master/src/inner.rs#L84 The general idea is to divide the file among threads by splitting it at byte offsets, then seeking forward from each offset to the next record boundary. This gives you non-overlapping sections, so no record is parsed twice.

  • polars

    Dataframes powered by a multithreaded, vectorized query engine, written in Rust

  • I don't use Arrow's CSV parser. This is the code I'm talking about: https://github.com/ritchie46/polars/blob/master/polars/polars-io/src/fork/csv.rs

  • jemalloc

  • It looks like jemalloc will use madvise where appropriate to tell the OS it doesn't need pages resident in memory. Search for MADV_DONTNEED in: https://github.com/jemalloc/jemalloc/blob/a943172b732e65da34a19469f31cd3ec70cf05b0/src/pages.c
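
The byte-offset splitting strategy described in the loadtxt comment above can be sketched roughly as follows. This is a minimal illustration using only the Rust standard library, not the code from the linked loadtxt repository: the names `chunk_ranges`, `sum_chunk`, and `parallel_sum` are hypothetical, the input is assumed to be already in memory as bytes, and records are assumed to be newline-terminated with no newlines inside a record.

```rust
use std::thread;

/// Split `data` into at most `n_chunks` contiguous ranges, each ending just
/// after a record delimiter (or at end of input), so that no record
/// straddles two ranges. (Hypothetical helper for illustration.)
fn chunk_ranges(data: &[u8], n_chunks: usize, delim: u8) -> Vec<(usize, usize)> {
    let mut ranges = Vec::new();
    let mut start = 0;
    for i in 1..=n_chunks {
        if start >= data.len() {
            break;
        }
        // Nominal split point: an even division of the input by bytes.
        let nominal = (data.len() * i / n_chunks).max(start);
        // Seek forward from the nominal offset to the next record boundary.
        let end = match data[nominal..].iter().position(|&b| b == delim) {
            Some(p) => nominal + p + 1,
            None => data.len(),
        };
        ranges.push((start, end));
        start = end;
    }
    ranges
}

/// Parse every whitespace-separated number in one chunk and sum them.
/// Tokens that fail to parse are skipped in this sketch.
fn sum_chunk(chunk: &[u8]) -> f64 {
    std::str::from_utf8(chunk)
        .unwrap_or("")
        .split_whitespace()
        .filter_map(|tok| tok.parse::<f64>().ok())
        .sum()
}

/// Parse the chunks on separate threads; the ranges do not overlap,
/// so no record is parsed twice.
fn parallel_sum(data: &[u8], n_threads: usize) -> f64 {
    let ranges = chunk_ranges(data, n_threads, b'\n');
    thread::scope(|s| {
        let handles: Vec<_> = ranges
            .iter()
            .map(|&(a, b)| s.spawn(move || sum_chunk(&data[a..b])))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().unwrap())
            .sum::<f64>()
    })
}
```

Calling `parallel_sum(bytes, 4)` on newline-delimited numeric text would parse up to four sections concurrently. Chunks can come out uneven when records are long relative to the chunk size, which is the price of only cutting at record boundaries.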

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Related posts

  • Why Python's Integer Division Floors (2010)

    1 project | news.ycombinator.com | 28 Feb 2024
  • Polars 0.20 Released

    1 project | news.ycombinator.com | 16 Dec 2023
  • Polars: Dataframes powered by a multithreaded query engine, written in Rust

    1 project | news.ycombinator.com | 7 Dec 2023
  • Polars 0.34 is released. (A query engine focussing on DataFrame front ends)

    1 project | /r/u_Dazzling_Finger_8120 | 26 Oct 2023
  • Polars 0.34 is released. (A query engine focussing on DataFrame front ends)

    1 project | /r/rust | 26 Oct 2023