Java event-streaming

Open-source Java projects categorized as event-streaming

Top 7 Java event-streaming Projects

  • Apache Pulsar

    Apache Pulsar - distributed pub-sub messaging system

    Project mention: Query Real Time Data in Kafka Using SQL | dev.to | 2023-03-23

    RisingWave is an open-source distributed SQL database for stream processing. It accepts data from sources like Apache Kafka, Apache Pulsar, Amazon Kinesis, Redpanda, and databases via native change data capture (CDC) connections to MySQL and PostgreSQL. It uses materialized views, which cache the results of your query operations, making it efficient for long-running stream-processing queries.
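
    As a concrete illustration of the materialized-view idea, here is a minimal, hypothetical Java sketch. It assumes RisingWave is reachable over its PostgreSQL-compatible wire protocol (host, port 4566, user, and database name are placeholder assumptions), and it invents an "orders" source and view purely for the example.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class RisingWaveMaterializedViewSketch {
        public static void main(String[] args) throws Exception {
            // Connection details are assumptions; adjust to your deployment.
            String url = "jdbc:postgresql://localhost:4566/dev";
            try (Connection conn = DriverManager.getConnection(url, "root", "");
                 Statement stmt = conn.createStatement()) {

                // Assumes a source or table named `orders` already exists (e.g. fed from
                // a Kafka topic). The view's result is maintained incrementally as new
                // events arrive, rather than being recomputed on every query.
                stmt.execute(
                    "CREATE MATERIALIZED VIEW IF NOT EXISTS order_totals AS " +
                    "SELECT customer_id, SUM(amount) AS total " +
                    "FROM orders GROUP BY customer_id");

                // Query the continuously maintained result like an ordinary table.
                try (ResultSet rs = stmt.executeQuery(
                        "SELECT customer_id, total FROM order_totals LIMIT 10")) {
                    while (rs.next()) {
                        System.out.println(rs.getLong("customer_id") + " -> " + rs.getBigDecimal("total"));
                    }
                }
            }
        }
    }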

  • debezium

    Change data capture for a variety of databases. Please log issues at https://issues.redhat.com/browse/DBZ.

    Project mention: PostgreSQL Logical Replication Explained | news.ycombinator.com | 2023-03-18

    Logical replication is also great for replicating to other systems - for example Debezium [1], which writes all changes to a Kafka stream (a minimal embedded-engine sketch is included after this comment).

    I'm using it to develop a system to replicate data to in-app SQLite databases, via an in-between storage layer [2]. Logical replication is quite a low-level tool with many tricky cases, which can be difficult to handle when integrating with it directly.

    Some examples:

    1. Any value over 8KB compressed (configurable) is stored separately from the rest of the row (TOAST storage), and unchanged TOASTed values are not included in the replicated record by default. You need to keep track of old values in the external system, or use REPLICA IDENTITY FULL (which adds a lot of overhead on the source database).

    2. PostgreSQL's primary keys can be pretty much any combination of columns, which may or may not be used as the table's replica identity, and the replica identity may change at any time. If "REPLICA IDENTITY FULL" is used, you don't even have an explicit primary key on the receiver side - the entire record is considered the identity. With "REPLICA IDENTITY NOTHING", there is no identity at all - every operation is treated as an insert. The replica identity is global per table, so if logical replication is used to replicate to multiple systems, you may not have full control over it. This means many different combinations of replica identity need to be handled.

    3. For initial sync you need to read the tables directly. It takes extra effort to make sure these are replicated in the same way as with incremental replication - for example taking into account the list of published tables, replica identity, row filters and column lists.

    4. Depending on what is used for high availability, replication slots may get lost in a failover event, meaning you'll have to re-sync all data from scratch. This applies whether the standby is kept in sync using physical or logical replication. The only case where this is not an issue is where the underlying block storage is replicated, which is the case in AWS RDS for example.

    [1]: https://debezium.io

    [2]: https://powersync.co
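
    As a concrete illustration of consuming these change events from Java without going through Kafka Connect, below is a minimal sketch using Debezium's embedded engine. Host, credentials, table name, and the local offset file are placeholder assumptions, and some property names (e.g. topic.prefix) may differ slightly between Debezium versions.

    import io.debezium.engine.ChangeEvent;
    import io.debezium.engine.DebeziumEngine;
    import io.debezium.engine.format.Json;

    import java.util.Properties;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class PostgresCdcSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.setProperty("name", "pg-cdc-sketch");
            props.setProperty("connector.class", "io.debezium.connector.postgresql.PostgresConnector");
            props.setProperty("plugin.name", "pgoutput");            // built-in logical decoding plugin
            props.setProperty("database.hostname", "localhost");     // placeholder connection details
            props.setProperty("database.port", "5432");
            props.setProperty("database.user", "postgres");
            props.setProperty("database.password", "postgres");
            props.setProperty("database.dbname", "appdb");
            props.setProperty("topic.prefix", "app");                // logical name prefixed to topics
            props.setProperty("table.include.list", "public.orders");
            // Offsets are kept in a local file for this sketch; in production they
            // would normally live in Kafka or another durable store.
            props.setProperty("offset.storage", "org.apache.kafka.connect.storage.FileOffsetBackingStore");
            props.setProperty("offset.storage.file.filename", "/tmp/offsets.dat");
            props.setProperty("offset.flush.interval.ms", "10000");

            DebeziumEngine<ChangeEvent<String, String>> engine = DebeziumEngine.create(Json.class)
                    .using(props)
                    .notifying(record -> {
                        // Each record is a JSON-encoded change event (insert, update, or delete).
                        System.out.println(record.key() + " -> " + record.value());
                    })
                    .build();

            ExecutorService executor = Executors.newSingleThreadExecutor();
            executor.execute(engine);
        }
    }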

  • kafka-ui

    Open-Source Web UI for Apache Kafka Management

    Project mention: Unable to connect localhost client to Kafka broker running in docker | reddit.com/r/apachekafka | 2023-03-09

    Hey, I have an example like that here; this Kafka is easily accessible from outside of the Docker network via localhost:9097.
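
    For context, a client on the host machine in a setup like that only needs its bootstrap servers pointed at the listener the broker advertises to the outside. A minimal Java consumer sketch follows; the topic name and group id are invented, and the port is taken from the comment above.

    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.serialization.StringDeserializer;

    import java.time.Duration;
    import java.util.List;
    import java.util.Properties;

    public class LocalhostConsumerSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            // The broker must advertise a listener reachable from the host,
            // e.g. localhost:9097 as in the example above.
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9097");
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "local-test");   // invented group id
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(List.of("test-topic"));             // invented topic name
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(5));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("%s -> %s%n", record.key(), record.value());
                }
            }
        }
    }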

  • kafdrop

    Kafka Web UI

    Project mention: Kafka visualization tool | reddit.com/r/apachekafka | 2023-02-15

  • EventMesh

    EventMesh is a new generation serverless event middleware for building distributed event-driven applications.

  • kop

    Kafka-on-Pulsar - A protocol handler that brings native Kafka protocol to Apache Pulsar

    Project mention: Interview question | reddit.com/r/java | 2023-02-23

    I process hospitality data somewhat similarly, but use Pulsar and can individually acknowledge messages, have DLQs built in, and if needed can stream just like Kafka. And if I need Kafka compatibility, I can use something like StreamNative's KOP and get it over my existing Pulsar queues.
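
    To illustrate the individual acknowledgement and built-in dead-letter behaviour mentioned here, below is a minimal Pulsar consumer sketch. The service URL, topic, subscription name, and redelivery count are placeholder assumptions for the example.

    import org.apache.pulsar.client.api.Consumer;
    import org.apache.pulsar.client.api.DeadLetterPolicy;
    import org.apache.pulsar.client.api.Message;
    import org.apache.pulsar.client.api.PulsarClient;
    import org.apache.pulsar.client.api.Schema;
    import org.apache.pulsar.client.api.SubscriptionType;

    public class PulsarAckDlqSketch {
        public static void main(String[] args) throws Exception {
            PulsarClient client = PulsarClient.builder()
                    .serviceUrl("pulsar://localhost:6650")     // placeholder broker address
                    .build();

            Consumer<String> consumer = client.newConsumer(Schema.STRING)
                    .topic("hospitality-events")               // invented topic name
                    .subscriptionName("processor")
                    .subscriptionType(SubscriptionType.Shared)
                    // After 3 failed deliveries the message is routed to a dead-letter topic.
                    .deadLetterPolicy(DeadLetterPolicy.builder()
                            .maxRedeliverCount(3)
                            .deadLetterTopic("hospitality-events-dlq")
                            .build())
                    .subscribe();

            while (true) {
                Message<String> msg = consumer.receive();
                try {
                    process(msg.getValue());
                    consumer.acknowledge(msg);           // acknowledge this one message individually
                } catch (Exception e) {
                    consumer.negativeAcknowledge(msg);   // ask for redelivery; eventually hits the DLQ
                }
            }
        }

        private static void process(String value) {
            System.out.println("processing: " + value);
        }
    }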

  • pulsar-recipes

    A StreamNative library containing a collection of recipes that are implemented on top of the Pulsar client to provide higher-level functionality closer to the application domain.

    Project mention: December 5, 2022: FLiP Stack Weekly | dev.to | 2022-12-03

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2023-03-23.

Index

What are some of the best open-source event-streaming projects in Java? This list will help you:

#  Project          Stars
1  Apache Pulsar   12,438
2  debezium         8,246
3  kafka-ui         5,248
4  kafdrop          4,374
5  EventMesh        1,185
6  kop                386
7  pulsar-recipes       5