Top 23 Kafka Open-Source Projects
-
-
data-engineering-zoomcamp
Data Engineering Zoomcamp is a free 9-week course on building production-ready data pipelines. The next cohort starts in January 2026. Join the course here 👇🏼
You can sign up here: https://github.com/DataTalksClub/data-engineering-zoomcamp/
-
Kafka is a distributed streaming platform used to build real-time data pipelines and streaming applications. It allows producers to send messages to topics, which are then consumed by various consumers, making it ideal for event-driven architectures.
-
SpringBoot-Labs
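A repository covering six topic series: Spring Boot 2.X, Spring Cloud, Spring Cloud Alibaba, Dubbo, distributed message queues, and distributed transactions. If you find it useful, please give it a Star in the top-right corner. Thanks, 1024.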
-
Telegraf
Agent for collecting, processing, aggregating, and writing metrics, logs, and other arbitrary data.
Project mention: The Performance Battle the hardening of Vault and OWASP: What Matters | dev.to | 2026-05-07
A common mistake we see is teams rejecting Vault hardening because they assume it will add 400ms+ p99 latency, a myth perpetuated by outdated blog posts from 2020. Our benchmarks of Vault 1.15 show that OWASP-compliant TLS 1.3 and rate limiting add only 12ms p99 latency, not 400ms. Before deciding against hardening, run the first code example (vault_owasp_benchmark.py) against your own Vault deployment to get real numbers for your workload. Use tools like https://github.com/influxdata/telegraf to collect latency metrics from production, and https://github.com/grafana/grafana to visualize percentiles over time. For read-heavy workloads (10k+ requests per second), the latency overhead is even lower (under 8ms p99) because TLS 1.3 has lower handshake overhead than TLS 1.2.
We worked with a SaaS company that rejected hardening for 6 months due to latency fears, only to find after benchmarking that the overhead was 9ms p99, well within their 100ms SLA. They implemented hardening in 2 weeks, cut breach risk by 89%, and saw no customer impact. Always benchmark with your own traffic patterns, not generic numbers from the internet. Use the Go benchmark tool or Python script provided to test with your secret sizes, request patterns, and auth methods. If you see unexpected latency spikes, check for misconfigured cipher suites or rate limits that are too strict, not the hardening itself.
-
-
debezium
Change data capture for a variety of databases. Please log issues at https://github.com/debezium/dbz/issues.
Project mention: 7 Free Tools for Data Pipeline Reconciliation and Cross-Source Validation | dev.to | 2026-05-13
Debezium is an open-source CDC (change data capture) platform that streams database changes (inserts, updates, deletes) from supported databases (PostgreSQL, MySQL, MongoDB, SQL Server) to downstream consumers via Apache Kafka.
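To make the insert/update/delete stream concrete, here is a minimal sketch of a downstream consumer applying change events to an in-memory copy of a table. The event shape (an `op` code of `c`, `u`, or `d` plus `before` and `after` row images) is a simplified form of Debezium's event envelope; the `apply_change` helper, the `id` key column, and the sample rows are hypothetical, and no Kafka client is involved.

```python
from typing import Any, Dict

Row = Dict[str, Any]


def apply_change(table: Dict[int, Row], event: Dict[str, Any]) -> None:
    """Apply one simplified change event to an in-memory replica keyed by 'id'."""
    op = event["op"]
    if op in ("c", "u"):
        # Create or update: store the new row image.
        after = event["after"]
        table[after["id"]] = after
    elif op == "d":
        # Delete: drop the row identified by the old row image.
        table.pop(event["before"]["id"], None)
    else:
        raise ValueError(f"unsupported op: {op!r}")


# Hypothetical sample events, in the order they were captured.
events = [
    {"op": "c", "before": None, "after": {"id": 1, "status": "new"}},
    {"op": "c", "before": None, "after": {"id": 2, "status": "new"}},
    {"op": "u", "before": {"id": 1, "status": "new"},
     "after": {"id": 1, "status": "paid"}},
    {"op": "d", "before": {"id": 2, "status": "new"}, "after": None},
]

replica: Dict[int, Row] = {}
for e in events:
    apply_change(replica, e)

print(replica)  # {1: {'id': 1, 'status': 'paid'}}
```

Replaying the events in order is what keeps the replica consistent with the source, which is why reconciliation tools typically compare the replica against the source table after the stream has caught up.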
-
I’ve packaged this logic into kafka-pulse-go, a lightweight library that works with most popular Kafka clients. The core logic is decoupled from specific client implementations, with adapters provided for Sarama, segmentio/kafka-go, and Confluent’s client.
-
redpanda
Redpanda is a streaming data platform for developers. Kafka API compatible. 10x faster. No ZooKeeper. No JVM!
Project mention: Top Open-Source Data Engineering Tools- Unravelling the Best in 2026 | dev.to | 2025-12-10
Redpanda
-
GitHub: https://github.com/provectus/kafka-ui
-
kubeshark
eBPF-powered network observability for Kubernetes. Indexes L4/L7 traffic with full K8s context, decrypts TLS without keys. Queryable by AI agents via MCP and humans via dashboard.
-
-
automq
Diskless Kafka® on S3. 10x Cost-Effective. No Cross-AZ Traffic Cost. Autoscale in seconds. Single-digit ms latency. Multi-AZ Availability.
Let's move on to the error from AutoMQ:
-
I'm using Watermill for the event bus with Redis Streams as the backend. Redis Streams has this concept of consumer groups; consumers in the same group split messages between them, while different groups each receive all messages.
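The consumer-group behaviour described above can be sketched with a small in-memory model. This is not the Watermill or Redis API: the `deliver` function and the group and consumer names are hypothetical, and round-robin is an assumption made for simplicity (in Redis Streams, a message goes to whichever consumer in the group reads next).

```python
from collections import defaultdict
from itertools import cycle
from typing import Dict, List


def deliver(messages: List[str], groups: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return what each consumer receives.

    Every group sees the full stream; consumers inside one group
    take turns, so each message reaches exactly one of them.
    """
    received: Dict[str, List[str]] = defaultdict(list)
    for group, consumers in groups.items():
        turn = cycle(consumers)  # simplified stand-in for "next free consumer"
        for msg in messages:
            received[f"{group}/{next(turn)}"].append(msg)
    return dict(received)


inbox = deliver(
    ["m1", "m2", "m3", "m4"],
    {"billing": ["a", "b"], "audit": ["x"]},
)
for consumer, msgs in sorted(inbox.items()):
    print(consumer, msgs)
```

In this run the two `billing` consumers split the four messages between them, while the single `audit` consumer receives all four, which is the split-within-a-group, fan-out-across-groups behaviour the post describes.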
-
risingwave
Event streaming platform for agentic AI. Continuously ingest, transform, and serve event streams in real time, at scale.
In crypto markets, these price differences, or spreads, appear and vanish in milliseconds. If your data pipeline takes five seconds to process a batch of prices, the opportunity is already gone. This post demonstrates how to use RisingWave—an open-source real-time event streaming platform—to detect arbitrage opportunities with sub-second latency using standard SQL.
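As a rough illustration of the spread check itself, here is a sketch in plain Python rather than RisingWave SQL. It is not the query from the post: the `find_spread` function, the exchange names, the prices, and the 0.5% threshold are all hypothetical.

```python
from typing import Dict, Optional, Tuple


def find_spread(
    quotes: Dict[str, float], min_pct: float
) -> Optional[Tuple[str, str, float]]:
    """Return (buy_exchange, sell_exchange, spread_pct) when the gap between
    the cheapest and the most expensive quote is at least min_pct, else None."""
    if len(quotes) < 2:
        return None
    buy = min(quotes, key=lambda name: quotes[name])    # cheapest venue
    sell = max(quotes, key=lambda name: quotes[name])   # most expensive venue
    spread_pct = (quotes[sell] - quotes[buy]) / quotes[buy] * 100
    if spread_pct >= min_pct:
        return buy, sell, spread_pct
    return None


# Hypothetical latest prices for one trading pair on three venues.
latest = {"exchange_a": 100.0, "exchange_b": 101.0, "exchange_c": 100.2}
hit = find_spread(latest, min_pct=0.5)
print(hit)
```

A streaming engine would re-evaluate this comparison on every incoming price update instead of on a fixed batch, which is what keeps detection latency below the lifetime of the spread.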
-
-
Project mention: Opinion: Why You Should Use NATS 2.10 Over Kafka for Edge Messaging | dev.to | 2026-04-28
-
DevOps-Bash-tools
1000+ DevOps Bash Scripts - AWS, GCP, Kubernetes, Docker, CI/CD, APIs, SQL, PostgreSQL, MySQL, Hive, Impala, Kafka, Hadoop, Jenkins, GitHub, GitLab, BitBucket, Azure DevOps, TeamCity, Spotify, MP3, LDAP, Code/Build Linting, pkg mgmt for Linux, Mac, Python, Perl, Ruby, NodeJS, Golang, Advanced dotfiles: .bashrc, .vimrc, .gitconfig, .screenrc, tmux..
-
-
CAP
A distributed transaction solution for microservices based on eventual consistency; also an event bus with the Outbox pattern (by dotnetcore)
-
-
Project mention: Flink CDC from the Trenches: Handling JSON in Pipelines with UDFs | dev.to | 2025-12-16
A critically important class that Flink CDC ships for internal type conversions is the DataTypeConverter, whose source code is available in the Flink CDC repository.
-
materialize
The live data layer for apps and AI agents. Create up-to-the-second views into your business, just using SQL (by MaterializeInc)
Project mention: ANN v3: 200ms p99 query latency over 100B vectors | news.ycombinator.com | 2026-01-25
I agree our sample may not be representative but we try to stay focused on the current and next crop of tpuf customers. So far "CI prohibits network access during tests" just hasn't come up as a pain point for any of them, but as I mentioned in another comment [0], we're definitely keeping an open mind about introducing an offline dev experience.
At my last company an engineer spent a year implementing Bazel [0][1] only to have it ripped out after they left [2] due to the maintenance burden. You might say it was a little bit of a hassle. :)
[0]: https://news.ycombinator.com/item?id=46758156
[1]: https://github.com/MaterializeInc/materialize/pull/24243
[2]: https://github.com/MaterializeInc/materialize/pull/31006
[3]: https://github.com/MaterializeInc/materialize/pull/33895
Kafka discussion
Kafka related posts
-
Cursor made me do it
-
Made a Tool to Stream Changes from Microsoft SQL Server to Apache Kafka
-
FastStream 0.7: MQTT support – in-memory tests, AsyncAPI generation and more
-
Usage-Based Billing for AI Agents with FastAPI and Kong
-
Show HN: Diom – Back end primitives (queue, rate limit, etc.) in one Rust binary
-
Show HN: Diom – Open-source back end primitives with no runtime dependencies
-
7 Free Tools for Data Pipeline Reconciliation and Cross-Source Validation
-
A note from our sponsor - SaaSHub
www.saashub.com | 13 Jun 2026
Index
What are some of the best open-source Kafka projects? This list will help you:
| # | Project | Stars |
|---|---|---|
| 1 | pathway | 63,006 |
| 2 | data-engineering-zoomcamp | 42,355 |
| 3 | Apache Kafka | 32,807 |
| 4 | SpringBoot-Labs | 20,089 |
| 5 | Telegraf | 17,619 |
| 6 | C++ Workflow | 14,369 |
| 7 | debezium | 12,813 |
| 8 | sarama | 12,493 |
| 9 | redpanda | 12,201 |
| 10 | kafka-ui | 12,055 |
| 11 | kubeshark | 11,949 |
| 12 | kafka-manager | 11,938 |
| 13 | automq | 10,005 |
| 14 | watermill | 9,750 |
| 15 | risingwave | 9,077 |
| 16 | connect | 8,678 |
| 17 | kafka-go | 8,570 |
| 18 | DevOps-Bash-tools | 8,295 |
| 19 | graylog | 8,061 |
| 20 | CAP | 7,088 |
| 21 | Faust | 6,823 |
| 22 | flink-cdc | 6,432 |
| 23 | materialize | 6,316 |