Data Parallel Pipeline/MapReduce in C++?

This page summarizes the projects mentioned and recommended in the original post on /r/cpp.

  • mr4c

  • The big cloud platforms all have frameworks for data parallel pipelines, but (I think?) not in C++. I've looked at Hadoop Pipes and MR4C. Both have advantages and disadvantages. Any other suggestions?

  • cylon

    Cylon is a fast, scalable, distributed-memory, parallel runtime with a Pandas-like DataFrame. (by cylondata)

  • There's also https://cylondata.org/ which is more of a Pandas approach.

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
