For processing strings, streams in C++ can be slow

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • C++ Format

    A modern formatting library

  • C++ streams are currently a good example of a pure product of the 1990s/2000s, when the hype around object orientation was ongoing.

    Almost everything related to them has this OO code smell:

    - Usage of runtime dispatch with virtual calls when it is not necessary, causing a negative impact on performance: a shame for C++.

    - Heavy usage of function overloading with the "<<" operator, leading to pages-long compilation errors when an overload fails.

    - Hidden state everywhere, with stateful formatters and globals in the background.

    - Unnecessary complexity with std::locale, which is almost entirely useless for proper internationalisation.

    - Useless encapsulation, with errors reported as abstracted bit flags. This is absolutely horrendous when dealing with file I/O: it hides away the underlying error with no proper way to access it.

    - A deep class hierarchy that makes the entire thing look like spaghetti.

    - Useless abstraction with stringstream, which hides the underlying buffer away, making it close to unusable on safety-critical systems.

    All of that has aged pretty badly, and for good reasons.

    Fortunately, there is a way out on the horizon with the work of Victor Zverovich on std::format and libfmt [1].

    [1]: https://github.com/fmtlib/fmt
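
The hidden-state and bit-flag points in the list above can be illustrated with a minimal sketch that uses only standard iostream calls. The function names and the file path are hypothetical, chosen for illustration; this is not code from the original comment.

```cpp
#include <cassert>
#include <fstream>
#include <ios>
#include <sstream>
#include <string>

// Hidden state: std::hex is "sticky" and affects every later integer insertion.
std::string sticky_hex() {
    std::ostringstream out;
    out << std::hex << 255;  // switches the stream to hexadecimal
    out << ' ' << 16;        // still hexadecimal, so this writes "10"
    return out.str();
}

// Getting decimal output back requires restoring the saved flags by hand.
std::string restored_flags() {
    std::ostringstream out;
    const std::ios_base::fmtflags saved = out.flags();
    out << std::hex << 255;
    out.flags(saved);        // undo the hidden state change
    out << ' ' << 16;
    return out.str();
}

// Error reporting: a failed open only sets failbit. The stream does not say
// whether the file was missing, unreadable, or something else.
bool open_failed_with_only_a_flag(const std::string& path) {
    std::ifstream in(path);
    return in.fail() && !in.is_open();
}

// Checks that document the behaviour described in the comments above.
// The path is an assumption: it is expected not to exist on the machine.
void demo() {
    assert(sticky_hex() == "ff 10");
    assert(restored_flags() == "ff 16");
    assert(open_failed_with_only_a_flag("/hypothetical/missing/file.txt"));
}
```

Calling demo() from main runs the checks. By contrast, a formatting call that takes its format specification per call, as std::format does in C++20, has no sticky state to restore.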

  • pystring

    C++ functions matching the interface and behavior of python string methods with std::string

  • I like C much more than C++, but even I must say that https://github.com/fmtlib/fmt (which is the basis for std::format) is pretty nice. Together with pystring (https://github.com/imageworks/pystring) it makes string processing in C++ somewhat bearable (still slow, though, because pystring is based on std::string and allocates excessively, but at least it is convenient).

  • node

    Node.js JavaScript runtime ✨🐢🚀✨

  • The GitHub issue:

    https://github.com/nodejs/node/pull/50253

    Note that the person mixing Java knowledge with C++ isn't Daniel Lemire.

  • closure-library

    Google's common JavaScript library

  • > "I recently learned that some Node.js engineers prefer stream classes when building strings, for performance reasons." Pretty much tells you everything you need to know about Node.js, I guess.

    Google Closure Library includes a StringBuffer class. [1]

    I recall it having explanatory notes, but I don't see them in the code now. JavaScript engines can optimize a string concatenation into an in-place edit if there is only one reference to the first string. The StringBuffer class keeps the reference count at one, guaranteeing that this optimization is available even if the StringBuffer itself is shared.

    [1] https://github.com/google/closure-library/blob/master/closur...
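
The quote above is about building strings through stream classes. A minimal C++ sketch of the two styles being contrasted follows; the function names are hypothetical and this is not the code from the Node.js pull request. It only shows that both styles produce the same result; any speed difference would have to be measured on the target platform.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Style 1: build the result through a stream. Each << goes through the
// stream's formatting layer, and str() returns a copy of the buffer.
std::string join_with_stream(const std::vector<std::string>& parts, char sep) {
    std::ostringstream out;
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (i != 0) out << sep;
        out << parts[i];
    }
    return out.str();
}

// Style 2: append directly to a std::string, reserving the final size first
// so the loop does not need to grow the buffer repeatedly.
std::string join_with_append(const std::vector<std::string>& parts, char sep) {
    std::size_t total = parts.empty() ? 0 : parts.size() - 1;  // separators
    for (const std::string& p : parts) total += p.size();

    std::string out;
    out.reserve(total);
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (i != 0) out += sep;
        out += parts[i];
    }
    return out;
}

// Both styles are expected to agree on every input.
void check_join() {
    const std::vector<std::string> parts = {"a", "b", "c"};
    assert(join_with_stream(parts, ',') == "a,b,c");
    assert(join_with_append(parts, ',') == "a,b,c");
}
```

The second style keeps the buffer visible to the caller as an ordinary std::string, which is the property the stringstream criticism earlier in the thread is about.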

