Java Batch

Open-source Java projects categorized as Batch

Top 4 Java Batch Projects

  • beam

    Apache Beam is a unified programming model for Batch and Streaming data processing.

  • Project mention: Ask HN: Does (or why does) anyone use MapReduce anymore? | news.ycombinator.com | 2024-01-24

    The "streaming systems" book answers your question and more: https://www.oreilly.com/library/view/streaming-systems/97814.... It gives you a history of how batch processing started with MapReduce, and how attempts at scaling by moving towards streaming systems gave us all the subsequent frameworks (Spark, Beam, etc.).

    As for the framework called MapReduce, it isn't used much, but its descendant https://beam.apache.org very much is. Nowadays people often use "map reduce" as a shorthand for whatever batch processing system they're building on top of.

  • seatunnel

    SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.

  • Project mention: SeaTunnel – super high-performance, distributed data integration tool | news.ycombinator.com | 2024-04-28
  • easy-batch

    The simple, stupid batch framework for Java

  • java-dataloader

    A Java 8 port of Facebook DataLoader

  • Project mention: GraphQL Java Data Loader | dev.to | 2023-12-21

    https://www.graphql-java.com/documentation/batching/ https://github.com/graphql-java/java-dataloader

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Java Batch related posts

  • Ask HN: Does (or why does) anyone use MapReduce anymore?

    2 projects | news.ycombinator.com | 24 Jan 2024
  • How do Streaming Aggregation Pipelines work?

    1 project | /r/dataengineering | 6 Dec 2023
  • Releasing Temporian, a Python library for processing temporal data, built together with Google

    2 projects | /r/Python | 17 Sep 2023
  • Kafka cluster loses or duplicates messages

    1 project | /r/codehunter | 27 Apr 2023
  • Apache Beam

    1 project | news.ycombinator.com | 24 Apr 2023
  • Composer out of resources - "INFO Task exited with return code Negsignal.SIGKILL"

    1 project | /r/googlecloud | 17 Aug 2022
  • Pub/Sub parallel processing best practices

    1 project | /r/googlecloud | 28 Jul 2022

Index

What are some of the best open-source Batch projects in Java? This list will help you:

#  Project          Stars
1  beam             7,519
2  seatunnel        7,223
3  easy-batch         600
4  java-dataloader    486
