Java Data

Open-source Java projects categorized as Data

Top 13 Java Data Projects

  • kestra

    Infinitely scalable, event-driven, language-agnostic orchestration and scheduling platform to manage millions of workflows declaratively in code.

    Project mention: A High-Performance, Java-Based Orchestration Platform | /r/java | 2023-10-11

    Kestra's communication is asynchronous and based on a queuing mechanism. It leverages the Micronaut framework and offers two runners: one that uses a database (JDBC) for both the message queue and resource storage, and another that uses Kafka as the message queue and Elasticsearch as the resource storage. The platform is fully extensible and plugin-based, providing a rich set of plugins for various workflow tasks, triggers, and data storage options. For those interested, the GitHub repository is available here:

  • data-transfer-project

    The Data Transfer Project makes it easy for people to transfer their data between online service providers. We are establishing a common framework, including data models and protocols, to enable direct transfer of data both into and out of participating online service providers.

    Project mention: Apple TV, now with more Tailscale | 2023-09-18

    I would argue that it is exactly in line with Apple's brand identity.

    Pretty much everybody agrees that you need to back up your cloud storage as well as your local computer, and Apple even backs up your i-devices to the cloud, and yet there is no automated way of backing up your iCloud storage.

    About a decade ago, Google initiated the Data Transfer Framework[1] that allows you to transfer data from one cloud provider to another, directly from provider to provider instead of downloading it first. It sadly appears to not have gotten enough traction to be of any use.


  • proteus

    Proteus: A JSON-based LayoutInflater for Android

    Project mention: Am i safe by sticking with Java and XML for years ahead ? | /r/androiddev | 2023-06-04

    I guess it wouldn't be a first, but

  • nessie

    Nessie: Transactional Catalog for Data Lakes with Git-like semantics

    Project mention: Why is Hive Metastore everywhere? (Especially Iceberg) | /r/dataengineering | 2023-06-30

    Try Nessie - it recently got trino support as well ..

  • jimmer

    A revolutionary ORM framework for both Java and Kotlin.

  • riot

    🧨 Get data in & out of Redis with RIOT (by redis-developer)

    Project mention: Uploading Data from a CSV file | /r/redis | 2023-02-22

    Hi there - take a look at RIOT which might be helpful...

  • rapiddweller-benerator-ce

    BENERATOR is a leading software solution to generate, obfuscate, pseudonymize and migrate data for development, testing, and training purposes with a model-driven approach.

  • ModelRunner

    No-code, model-driven, natural-language data access platform

  • Db4o-gpl

    New Db4o GPL source code for Java 7+ & .netstandard2.0 Android Xamarin..., the best database project to help you learn how to make databases

  • nextcloud-tables

    📊 Android client for the Nextcloud Tables app

    Project mention: ⟳ 2 apps added, 54 updated at | /r/FDroidUpdates | 2023-06-11

    Nextcloud Tables (version 1.0.7): Companion app for Nextcloud Tables

  • SheetsIO

    Small configurable Java app that pulls data from a Google Spreadsheet (using the v4 API) and writes it to files and a local web server.

  • Data-Structures-and-Algorithms

    Solutions to Arrays, Strings, Lists, Sorting, Stacks, Trees and General DS problems using Java. (by anishkumar127)

  • SparkDB

    CSV-to-database-structure project (by NaDeSys)

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2023-10-11.

What are some of the best open-source Data projects in Java? This list will help you:

#   Project                          Stars
1   kestra                           5,045
2   data-transfer-project            3,522
3   proteus                          1,293
4   nessie                             735
5   jimmer                             483
6   riot                               204
7   rapiddweller-benerator-ce          118
8   ModelRunner                         57
9   Db4o-gpl                            29
10  nextcloud-tables                    22
11  SheetsIO                            19
12  Data-Structures-and-Algorithms      11
13  SparkDB                              3