Choosing a stream processor: Kafka Streaming vs Flink vs Spark Streaming vs Storm vs Samza?

This page summarizes the projects mentioned and recommended in the original post on reddit.com/r/dataengineering

  • Streamz

    Real-time stream processing for python

    I use https://github.com/python-streamz/streamz + Dask for 100% Python distributed mini-batch real-time processing, so we can import any Python library and deploy the server to production with less hassle. We process an average of 120 GB every day, CDC from Debezium and Kafka Connect Oracle Big Data Golden Gate.

  • streaming-ops

    Simulated production environment running Kubernetes targeting Apache Kafka and Confluent components on Confluent Cloud. Managed by declarative infrastructure and GitOps.

    We run 100+ Python scripts as Kubernetes deployments in GKE, with CI/CD using “streaming-ops” (https://github.com/confluentinc/streaming-ops), so that only changed scripts are reloaded when we push to master.

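The Streamz comment above describes mini-batch processing of CDC events. As a rough illustration of that idea only, here is a minimal sketch in plain standard-library Python; it is not the Streamz or Dask API, and the helper names (`mini_batches`, `count_ops`) and the event shape (an "op" field with c/u/d codes) are assumptions made up for this example.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List


def mini_batches(events: Iterable[dict], batch_size: int) -> Iterator[List[dict]]:
    """Yield consecutive lists of at most `batch_size` events."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(events)
    while True:
        # Pull the next chunk; an empty chunk means the source is exhausted.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def count_ops(batch: List[dict]) -> Dict[str, int]:
    """Count change events per operation type within one batch."""
    counts: Dict[str, int] = {}
    for event in batch:
        counts[event["op"]] = counts.get(event["op"], 0) + 1
    return counts


# Hypothetical CDC-style events: c = create, u = update, d = delete.
events = [
    {"op": "c", "id": 1},
    {"op": "u", "id": 1},
    {"op": "c", "id": 2},
    {"op": "d", "id": 1},
    {"op": "u", "id": 2},
]

# Process the stream two events at a time, one summary per mini-batch.
results = [count_ops(batch) for batch in mini_batches(events, batch_size=2)]
```

In a real deployment the batches would come from a Kafka topic rather than a list, and each batch would be handed to a distributed worker instead of a local function.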

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.


Related posts