Top 16 Java Flink Projects
-
Zeppelin
Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
-
LakeSoul
LakeSoul is an end-to-end, realtime and cloud native Lakehouse framework with fast data ingestion, concurrent update and incremental data analytics on cloud storages for both BI and AI applications.
-
paimon
Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
-
bitsail
BitSail is a distributed high-performance data integration engine which supports batch, streaming and incremental scenarios. BitSail is widely used to synchronize hundreds of trillions of data every day.
-
hadoopcryptoledger
Hadoop Crypto Ledger - Analyzing CryptoLedgers, such as Bitcoin Blockchain, on Big Data platforms, such as Hadoop/Spark/Flink/Hive
-
flink-http-connector
Http Connector for Apache Flink. Provides sources and sinks for DataStream, Table and SQL APIs.
-
scotty-window-processor
This repository provides Scotty, a framework for efficient window aggregations for out-of-order Stream Processing.
-
cratedb-flink-jobs
This repository accompanies the article "Build a data ingestion pipeline using Kafka, Flink, and CrateDB" and the "CrateDB Community Day #2".
7. Apache Flink | Github | tutorial
Now we can proceed with the definition of Apache Zeppelin. It is a web-based notebook that enables data-driven, interactive data analytics and collaborative documents with Python, Scala, SQL, Spark, and more. You can execute code and even schedule a job (via cron) to run at regular intervals.
18. Apache Paimon | Github | tutorial
Project mention: Top 10 Common Data Engineers and Scientists Pain Points in 2024 | dev.to | 2024-04-11
Data scientists often prefer Python for its simplicity and powerful libraries like Pandas or SciPy. However, many real-time data processing tools are Java-based. Take the example of Kafka, Flink, or Spark streaming. While these tools have their Python API/wrapper libraries, they introduce increased latency, and data scientists need to manage dependencies for both Python and JVM environments. For example, implementing a real-time anomaly detection model in Kafka Streams would require translating Python code into Java, slowing down pipeline performance, and requiring a complex initial setup.
Project mention: Auto-Synchronizing an Entire MySQL Database for Data Analysis | dev.to | 2023-09-01
Download JAR file: https://github.com/apache/doris-flink-connector/releases/tag/1.4.0
Project mention: Implementing a “Lookback” Window Using Apache Flink’s KeyedProcessFunction | /r/RedditEng | 2023-10-10
This concept is similar to a sliding window with a small step size, but with a more memory-efficient implementation. By using “slice sharing” instead of duplicating events into every overlapping window, the memory footprint is reduced. Scotty window processor is an open-source implementation of memory-efficient window aggregations with connectors for popular stream processors like Flink. This is a promising avenue for approximating a “lookback” window when aggregations like count, sum or histogram are required.
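The "slice sharing" idea in the excerpt above can be illustrated with a small, framework-free sketch. It is not Scotty's or Flink's actual implementation: the class name `SliceSharingSketch` and its methods are hypothetical, and it assumes a sum aggregation, a window size that is a multiple of the slide, and window ends aligned to the slide. Each event is folded into exactly one slice, and a window result is assembled from the slices it covers rather than from per-window copies of the events.

```java
import java.util.TreeMap;

// Hypothetical illustration of slice sharing for sliding-window sums.
// Assumption: windowSize is a multiple of slide, so every window is a
// union of whole slices of length `slide`.
final class SliceSharingSketch {
    private final long windowSize;
    private final long slide;
    // Slice start timestamp -> partial sum of the events in that slice.
    private final TreeMap<Long, Long> slices = new TreeMap<>();

    SliceSharingSketch(long windowSize, long slide) {
        if (slide <= 0 || windowSize <= 0 || windowSize % slide != 0) {
            throw new IllegalArgumentException("windowSize must be a positive multiple of slide");
        }
        this.windowSize = windowSize;
        this.slide = slide;
    }

    // Each event updates exactly one slice, even if many windows overlap it.
    // Keying by timestamp means late (out-of-order) events land in the right slice.
    void add(long timestamp, long value) {
        long sliceStart = Math.floorDiv(timestamp, slide) * slide;
        slices.merge(sliceStart, value, Long::sum);
    }

    // Sum over the window [windowEnd - windowSize, windowEnd),
    // computed by combining the shared slice aggregates.
    long windowSum(long windowEnd) {
        if (Math.floorMod(windowEnd, slide) != 0) {
            throw new IllegalArgumentException("windowEnd must be aligned to the slide");
        }
        long total = 0;
        for (long partial : slices.subMap(windowEnd - windowSize, true, windowEnd, false).values()) {
            total += partial;
        }
        return total;
    }

    // Number of slices currently held; stored state grows with slices, not with windows.
    int sliceCount() {
        return slices.size();
    }

    public static void main(String[] args) {
        SliceSharingSketch sketch = new SliceSharingSketch(10, 5);
        sketch.add(1, 2);
        sketch.add(7, 3);
        sketch.add(12, 4);
        sketch.add(3, 1); // arrives out of order, still lands in the first slice
        System.out.println("sum of [0, 10): " + sketch.windowSum(10));
        System.out.println("sum of [5, 15): " + sketch.windowSum(15));
    }
}
```

With a window of 10 and a slide of 5, every event belongs to two overlapping windows but is stored in only one slice. A production version would also need to evict slices older than the largest window, which this sketch leaves out.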
Java Flink related posts
- Top 10 Common Data Engineers and Scientists Pain Points in 2024
- Getting Started with Flink SQL, Apache Iceberg and DynamoDB Catalog
- Pyflink : Flink DataStream (KafkaSource) API to consume from Kafka
- How do I determine what the dependencies are when I make pom.xml file?
- Akka is moving away from Open Source
- Computation reuse via fusion in Amazon Athena
- Avro SpecificRecord File Sink using apache flink is not compiling due to error incompatible types: FileSink<?> cannot be converted to SinkFunction<?>
Index
What are some of the best open-source Flink projects in Java? This list will help you:
# | Project | Stars |
---|---|---|
1 | Apache Flink | 23,128 |
2 | Zeppelin | 6,261 |
3 | LakeSoul | 2,294 |
4 | paimon | 1,792 |
5 | SREWorks | 1,693 |
6 | bitsail | 1,575 |
7 | yauaa | 726 |
8 | flink-kubernetes-operator | 711 |
9 | flink-ml | 288 |
10 | doris-flink-connector | 276 |
11 | flink-faker | 194 |
12 | flink-remote-shuffle | 189 |
13 | hadoopcryptoledger | 142 |
14 | flink-http-connector | 118 |
15 | scotty-window-processor | 75 |
16 | cratedb-flink-jobs | 2 |