Kafka

Open-source projects categorized as Kafka

Top 23 Kafka Open-Source Projects

  1. pathway

    Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.

    Project mention: GitHub's Fake Star Economy | news.ycombinator.com | 2026-04-20
  2. SaaSHub

    SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives
  3. data-engineering-zoomcamp

    Data Engineering Zoomcamp is a free 9-week course on building production-ready data pipelines. The next cohort starts in January 2026. Join the course here 👇🏼

    Project mention: From APIs to Warehouses: AI-Assisted Data Ingestion with dlt | dev.to | 2026-03-01

    You can sign up here: https://github.com/DataTalksClub/data-engineering-zoomcamp/

  4. Apache Kafka

    Apache Kafka - A distributed event streaming platform

    Project mention: Building Kafka Producer-Consumer Using Go and Docker | dev.to | 2026-06-08

    Kafka is a distributed streaming platform used to build real-time data pipelines and streaming applications. It allows producers to send messages to topics, which are then consumed by various consumers, making it ideal for event-driven architectures.
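
    The producer, topic, and consumer relationship described above can be sketched with a toy in-memory model. This is not the Kafka client API: `Broker`, `Consumer`, and the `orders` topic are hypothetical names used only to show that a topic behaves like an append-only log and that each consumer tracks its own read offset.

```python
from collections import defaultdict


class Broker:
    """Toy stand-in for a broker: one append-only list per topic (assumption)."""

    def __init__(self):
        self.logs = defaultdict(list)

    def send(self, topic, message):
        # Appending never rewrites earlier messages; the offset is the index.
        self.logs[topic].append(message)
        return len(self.logs[topic]) - 1

    def read_from(self, topic, offset):
        return self.logs[topic][offset:]


class Consumer:
    """Each consumer keeps its own offset, so readers do not affect each other."""

    def __init__(self, broker, topic):
        self.broker = broker
        self.topic = topic
        self.offset = 0

    def poll(self):
        messages = self.broker.read_from(self.topic, self.offset)
        self.offset += len(messages)
        return messages


broker = Broker()
broker.send("orders", "order-1")
broker.send("orders", "order-2")

billing = Consumer(broker, "orders")
shipping = Consumer(broker, "orders")

first_billing = billing.poll()    # both messages sent so far
first_shipping = shipping.poll()  # the same two messages, read independently
broker.send("orders", "order-3")
second_billing = billing.poll()   # only the message sent after the last poll
```

    A real deployment adds partitions, replication, and consumer groups on top of this idea; the sketch leaves all of that out.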

  5. SpringBoot-Labs

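    A repository covering six series: Spring Boot 2.X, Spring Cloud, Spring Cloud Alibaba, Dubbo, distributed message queues, and distributed transactions. If you find it useful, please give it a Star in the top-right corner. Thanks, 1024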

  6. Telegraf

    Agent for collecting, processing, aggregating, and writing metrics, logs, and other arbitrary data.

    Project mention: The Performance Battle the hardening of Vault and OWASP: What Matters | dev.to | 2026-05-07

    A common mistake we see is teams rejecting Vault hardening because they assume it will add 400ms+ p99 latency, a myth perpetuated by outdated blog posts from 2020. Our benchmarks of Vault 1.15 show that OWASP-compliant TLS 1.3 and rate limiting add only 12ms p99 latency, not 400ms. Before deciding against hardening, run the first code example (vault_owasp_benchmark.py) against your own Vault deployment to get real numbers for your workload. Use tools like https://github.com/influxdata/telegraf to collect latency metrics from production, and https://github.com/grafana/grafana to visualize percentiles over time. For read-heavy workloads (10k+ requests per second), the latency overhead is even lower (under 8ms p99) because TLS 1.3 has lower handshake overhead than TLS 1.2. We worked with a SaaS company that rejected hardening for 6 months due to latency fears, only to find after benchmarking that the overhead was 9ms p99, well within their 100ms SLA. They implemented hardening in 2 weeks, cut breach risk by 89%, and saw no customer impact. Always benchmark with your own traffic patterns, not generic numbers from the internet. Use the Go benchmark tool or Python script provided to test with your secret sizes, request patterns, and auth methods. If you see unexpected latency spikes, check for misconfigured cipher suites or rate limits that are too strict, not the hardening itself.
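
    The advice to benchmark with your own traffic comes down to comparing tail percentiles before and after a change. The sketch below is not the `vault_owasp_benchmark.py` script mentioned in the excerpt; it is a minimal nearest-rank p99 calculation over synthetic latency samples, and the sample values and the `percentile` helper are assumptions made for illustration.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Synthetic latencies in milliseconds (made up, not measured).
baseline_ms = [x * 0.5 for x in range(1, 101)]   # 0.5 ms .. 50.0 ms
hardened_ms = [x + 9.0 for x in baseline_ms]     # pretend hardening adds 9 ms

baseline_p99 = percentile(baseline_ms, 99)
hardened_p99 = percentile(hardened_ms, 99)
overhead_ms = hardened_p99 - baseline_p99

sla_ms = 100.0  # SLA figure quoted in the excerpt
within_sla = hardened_p99 <= sla_ms
```

    With real data, the two sample lists would come from your own measurements, for example latency metrics exported by a collector such as Telegraf.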

  7. C++ Workflow

    C++ Parallel Computing and Asynchronous Networking Framework

  8. debezium

    Change data capture for a variety of databases. Please log issues at https://github.com/debezium/dbz/issues.

    Project mention: 7 Free Tools for Data Pipeline Reconciliation and Cross-Source Validation | dev.to | 2026-05-13

    Debezium is an open-source CDC (change data capture) platform that streams database changes - inserts, updates, deletes - from supported databases (PostgreSQL, MySQL, MongoDB, SQL Server) to downstream consumers via Apache Kafka.
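
    To make the insert, update, and delete stream concrete, here is a minimal sketch of a downstream consumer replaying change events into an in-memory copy of a table. It assumes a simplified event with only the `op`, `before`, and `after` fields of Debezium's change event envelope (`c` create, `u` update, `d` delete, `r` snapshot read); real events also carry schema and source metadata, and the `id` key and sample rows are hypothetical.

```python
def apply_change(table, event, key="id"):
    """Apply one simplified change event to a dict keyed by primary key."""
    op = event["op"]
    if op in ("c", "r", "u"):
        # Creates, snapshot reads, and updates all carry the new row in "after".
        row = event["after"]
        table[row[key]] = row
    elif op == "d":
        # Deletes carry the removed row in "before".
        table.pop(event["before"][key], None)
    else:
        raise ValueError(f"unsupported op: {op!r}")
    return table


# Hypothetical events, already parsed from JSON.
events = [
    {"op": "c", "before": None, "after": {"id": 1, "email": "a@example.com"}},
    {"op": "c", "before": None, "after": {"id": 2, "email": "b@example.com"}},
    {
        "op": "u",
        "before": {"id": 1, "email": "a@example.com"},
        "after": {"id": 1, "email": "a2@example.com"},
    },
    {"op": "d", "before": {"id": 2, "email": "b@example.com"}, "after": None},
]

replica = {}
for event in events:
    apply_change(replica, event)
```

    Comparing a replica built this way against the source table is one simple form of the reconciliation the linked post discusses.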

  9. sarama

    Sarama is a Go library for Apache Kafka.

    Project mention: Kafka Consumer Health Checks: Dead or Alive | dev.to | 2025-12-13

    I’ve packaged this logic into kafka-pulse-go, a lightweight library that works with most popular Kafka clients. The core logic is decoupled from specific client implementations, with adapters provided for Sarama, segmentio/kafka-go, and Confluent’s client.

  10. redpanda

    Redpanda is a streaming data platform for developers. Kafka API compatible. 10x faster. No ZooKeeper. No JVM!

    Project mention: Top Open-Source Data Engineering Tools- Unravelling the Best in 2026 | dev.to | 2025-12-10

    Redpanda

  11. kafka-ui

    Open-Source Web UI for Apache Kafka Management

    Project mention: Kafka UI in Action: Monitoring and Managing Kafka Like a Pro | dev.to | 2025-06-29

    GitHub: https://github.com/provectus/kafka-ui

  12. kubeshark

    eBPF-powered network observability for Kubernetes. Indexes L4/L7 traffic with full K8s context, decrypts TLS without keys. Queryable by AI agents via MCP and humans via dashboard.

    Project mention: API Traffic Analyzer for Kubernetes | news.ycombinator.com | 2026-03-09
  13. kafka-manager

    CMAK is a tool for managing Apache Kafka clusters

  14. automq

    Diskless Kafka® on S3. 10x Cost-Effective. No Cross-AZ Traffic Cost. Autoscale in seconds. Single-digit ms latency. Multi-AZ Availability.

    Project mention: Top 10 noteworthy Java errors in 2025 | dev.to | 2025-12-24

    Let's move on to the error from AutoMQ:

  15. watermill

    Building event-driven applications the easy way in Go.

    Project mention: How I built Upple: A modern uptime monitor with Go and React | dev.to | 2026-01-02

    I'm using Watermill for the event bus with Redis Streams as the backend. Redis Streams has this concept of consumer groups; consumers in the same group split messages between them, while different groups each receive all messages.
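
    The consumer group behaviour described above can be illustrated with a small in-memory model. This is neither the Watermill API nor a Redis client: `Stream`, the group names, and the round-robin assignment are assumptions made for the sketch (Redis hands each message to whichever group member reads next rather than strictly alternating).

```python
from collections import defaultdict


class Stream:
    """Toy stream: every group sees every message, members of a group share them."""

    def __init__(self):
        self.groups = {}                     # group name -> list of consumer names
        self.delivered = defaultdict(list)   # (group, consumer) -> messages
        self._next = defaultdict(int)        # group name -> round-robin counter

    def join(self, group, consumer):
        self.groups.setdefault(group, []).append(consumer)

    def publish(self, message):
        for group, consumers in self.groups.items():
            # Exactly one member of each group receives the message.
            target = consumers[self._next[group] % len(consumers)]
            self.delivered[(group, target)].append(message)
            self._next[group] += 1


stream = Stream()
stream.join("checks", "worker-a")
stream.join("checks", "worker-b")
stream.join("alerts", "notifier")

for message in ["m1", "m2", "m3", "m4"]:
    stream.publish(message)

worker_a = stream.delivered[("checks", "worker-a")]
worker_b = stream.delivered[("checks", "worker-b")]
notifier = stream.delivered[("alerts", "notifier")]
```

    The two `checks` workers split the four messages between them, while the single `alerts` consumer receives all four.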

  16. risingwave

    Event streaming platform for agentic AI. Continuously ingest, transform, and serve event streams in real time, at scale.

    Project mention: Building a Real-Time Crypto Arbitrage Monitoring System | dev.to | 2025-11-24

    In crypto markets, these price differences, or spreads, appear and vanish in milliseconds. If your data pipeline takes five seconds to process a batch of prices, the opportunity is already gone. This post demonstrates how to use RisingWave—an open-source real-time event streaming platform—to detect arbitrage opportunities with sub-second latency using standard SQL.
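
    The spread check at the heart of that post can be sketched in a few lines. The post itself uses SQL in RisingWave over live streams; the version below is plain Python over a single snapshot, and the exchange names, prices, and `find_spreads` helper are made up for illustration.

```python
def find_spreads(quotes, min_spread_pct):
    """Return (symbol, buy_exchange, sell_exchange, spread_pct) for wide spreads.

    quotes maps symbol -> {exchange name: last price}.
    """
    opportunities = []
    for symbol, prices in quotes.items():
        if len(prices) < 2:
            continue  # a spread needs at least two venues
        buy_at = min(prices, key=prices.get)    # cheapest venue
        sell_at = max(prices, key=prices.get)   # most expensive venue
        spread_pct = (prices[sell_at] - prices[buy_at]) / prices[buy_at] * 100
        if spread_pct >= min_spread_pct:
            opportunities.append((symbol, buy_at, sell_at, round(spread_pct, 3)))
    return opportunities


# Hypothetical snapshot of prices on two exchanges.
quotes = {
    "BTC-USD": {"exchange_a": 100000.0, "exchange_b": 100500.0},
    "ETH-USD": {"exchange_a": 4000.0, "exchange_b": 4001.0},
}

opportunities = find_spreads(quotes, min_spread_pct=0.1)
```

    A streaming system would re-evaluate this comparison continuously as each new price arrives instead of scanning a snapshot, which is where the sub-second latency comes from.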

  17. connect

    Fancy stream processing made operationally mundane (by redpanda-data)

  18. kafka-go

    Kafka library in Go

    Project mention: Opinion: Why You Should Use NATS 2.10 Over Kafka for Edge Messaging | dev.to | 2026-04-28
  19. DevOps-Bash-tools

    1000+ DevOps Bash Scripts - AWS, GCP, Kubernetes, Docker, CI/CD, APIs, SQL, PostgreSQL, MySQL, Hive, Impala, Kafka, Hadoop, Jenkins, GitHub, GitLab, BitBucket, Azure DevOps, TeamCity, Spotify, MP3, LDAP, Code/Build Linting, pkg mgmt for Linux, Mac, Python, Perl, Ruby, NodeJS, Golang, Advanced dotfiles: .bashrc, .vimrc, .gitconfig, .screenrc, tmux..

    Project mention: Level Up Your DevOps Workflow with Hari Sekhon's Bash Tools! | dev.to | 2025-06-23

    View the Project on GitHub

  20. graylog

    Free and open log management

  21. CAP

    Distributed transaction solution for microservices based on eventual consistency; also an event bus with the Outbox pattern (by dotnetcore)

  22. Faust

    Python Stream Processing

  23. materialize

    The live data layer for apps and AI agents. Create up-to-the-second views into your business, just using SQL (by MaterializeInc)

    Project mention: ANN v3: 200ms p99 query latency over 100B vectors | news.ycombinator.com | 2026-01-25

    I agree our sample may not be representative but we try to stay focused on the current and next crop of tpuf customers. So far "CI prohibits network access during tests" just hasn't come up as a pain point for any of them, but as I mentioned in another comment [0], we're definitely keeping an open mind about introducing an offline dev experience.

    At my last company an engineer spent a year implementing Bazel [0][1] only to have it ripped out after they left [2] due to the maintenance burden. You might say it was a little bit of a hassle. :)

    [0]: https://news.ycombinator.com/item?id=46758156

    [1]: https://github.com/MaterializeInc/materialize/pull/24243

    [2]: https://github.com/MaterializeInc/materialize/pull/31006

    [3]: https://github.com/MaterializeInc/materialize/pull/33895

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Kafka related posts

  • Cursor made me do it

    3 projects | dev.to | 4 Jun 2026
  • Made a Tool to Streams Changes from Microsoft SQL Server to Apache Kafka

    2 projects | news.ycombinator.com | 31 May 2026
  • FastStream 0.7: MQTT support – in-memory tests, AsyncAPI generation and more

    1 project | news.ycombinator.com | 1 Jun 2026
  • Usage-Based Billing for AI Agents with FastAPI and Kong

    2 projects | dev.to | 26 May 2026
  • Show HN: Diom – Back end primitives (queue, rate limit, etc.) in one Rust binary

    1 project | news.ycombinator.com | 20 May 2026
  • Show HN: Diom – Open-source back end primitives with no runtime dependencies

    1 project | news.ycombinator.com | 13 May 2026
  • 7 Free Tools for Data Pipeline Reconciliation and Cross-Source Validation

    4 projects | dev.to | 13 May 2026

Index

What are some of the best open-source Kafka projects? This list will help you:

 #  Project                    Stars
 1  pathway                    63,006
 2  data-engineering-zoomcamp  42,355
 3  Apache Kafka               32,807
 4  SpringBoot-Labs            20,089
 5  Telegraf                   17,619
 6  C++ Workflow               14,369
 7  debezium                   12,813
 8  sarama                     12,493
 9  redpanda                   12,201
10  kafka-ui                   12,055
11  kubeshark                  11,949
12  kafka-manager              11,938
13  automq                     10,005
14  watermill                   9,750
15  risingwave                  9,077
16  connect                     8,678
17  kafka-go                    8,570
18  DevOps-Bash-tools           8,295
19  graylog                     8,061
20  CAP                         7,088
21  Faust                       6,823
22  flink-cdc                   6,432
23  materialize                 6,316
