Data Ingestion - Build Your Own "Map Reduce"?

This page summarizes the projects mentioned and recommended in the original post on dev.to.

  1. mrjob

    Run MapReduce jobs on Hadoop or Amazon Web Services


  3. mmh3

    Python extension for MurmurHash (MurmurHash3), a set of fast and robust hash functions.

    Some notes: We don't need SHA-256, and not even base64; nothing bad happens if keys are not distributed perfectly evenly. We could use MMH3: googling "python murmurhash" gives 2 interesting results, and since both use the same C++ code, let's take the one with the most stars. Other options would be to simply take the key modulo the shard count (% NUM_SHARDS) or even shift right (which requires the shard count to be a power of 2).

  4. murmurhash

    💥 Cython bindings for MurmurHash2 (by explosion)


  5. py-spy

    Sampling profiler for Python programs

    Q: Are we sure about it? A: We could use py-spy and see.
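The sharding notes under mmh3 and murmurhash can be sketched roughly as follows. This is only an illustration, not the original post's code: `NUM_SHARDS = 16`, the helper names, and the `user-N` keys are assumptions made up for the example. It uses `mmh3.hash` when the package is installed and falls back to `zlib.crc32` from the standard library as a stand-in, so it runs either way. The shift variant takes the top bits of the 32-bit hash, which is why the shard count must be a power of 2.

```python
import zlib
from collections import Counter

try:
    import mmh3  # third-party MurmurHash3 binding (pip install mmh3)
except ImportError:
    mmh3 = None  # fall back to a stdlib hash below

# Hypothetical shard count; a power of 2 so the shift variant is valid.
NUM_SHARDS = 16


def hash32(key: str) -> int:
    """Return an unsigned 32-bit, non-cryptographic hash of the key."""
    data = key.encode("utf-8")
    if mmh3 is not None:
        return mmh3.hash(data, signed=False)
    # Stand-in when mmh3 is missing: CRC32 is not MurmurHash,
    # but it is also a cheap 32-bit hash and keeps the sketch runnable.
    return zlib.crc32(data)


def shard_by_modulo(key: str) -> int:
    """Works for any shard count."""
    return hash32(key) % NUM_SHARDS


def shard_by_shift(key: str) -> int:
    """Keep the top log2(NUM_SHARDS) bits; needs a power-of-2 shard count."""
    bits = NUM_SHARDS.bit_length() - 1
    return hash32(key) >> (32 - bits)


def shard_counts(keys) -> Counter:
    """Count how many keys land in each shard (modulo variant)."""
    return Counter(shard_by_modulo(k) for k in keys)


if __name__ == "__main__":
    keys = [f"user-{i}" for i in range(10_000)]
    for shard, count in sorted(shard_counts(keys).items()):
        print(shard, count)
```

Running the script prints the number of keys per shard, which is a quick way to check whether the distribution is even enough for the job at hand.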


