Top 17 Java Data Projects
-
Project mention: Using IRIS and Presto for high-performance and scalable SQL queries | dev.to | 2025-01-19
The rise of Big Data projects, real-time self-service analytics, online query services, and social networks, among others, has created demand for massive, high-performance data queries. MPP (massively parallel processing) database technology emerged in response to this challenge and quickly established itself. Among the open-source MPP options, Presto (https://prestodb.io/) is the best known. It originated at Facebook, where it was used for data analytics, and was later open-sourced. Teradata has since joined the Presto community and now offers support for it.
-
kestra
Workflow Automation Platform. Orchestrate & Schedule code in any language, run anywhere, 500+ plugins. Alternative to Zapier, Rundeck, Camunda, Airflow...
Project mention: Study Notes 2.2.7: Managing Schedules and Backfills with BigQuery in Kestra | dev.to | 2025-02-04
Kestra Documentation: Kestra.io
-
When I manage a project and have the freedom to choose my configuration structure, I always use TypeScript. I never understood the desire to keep configuration in ini/json/jsonnet/yaml. A strongly typed configuration with code completion seems so much more robust. Unless, of course, your use case is to load or change the config via an API.
I like what Apple is doing with https://pkl-lang.org/ though.
-
data-transfer-project
The Data Transfer Project makes it easy for platforms to build interoperable user data portability features. We are establishing a common framework, including data models and protocols, to enable direct transfer of data both into and out of participating online service providers.
-
Project mention: Polaris Catalog: An Open Source Catalog for Apache Iceberg | news.ycombinator.com | 2024-06-03
-
rapiddweller-benerator-ce
BENERATOR is a leading software solution to generate, obfuscate, pseudonymize and migrate data for development, testing, and training purposes with a model-driven approach.
-
pgCompare
pgCompare – a straightforward utility crafted to simplify the data comparison process, providing a robust solution for comparing data across various database platforms.
Project mention: Show HN: PgCompare – Data comparison made simple | news.ycombinator.com | 2024-06-02
-
Db4o-gpl
New Db4o GPL source code for Java 7+ and .NET Standard 2.0, Android, Xamarin..., the best database project to help you learn how to make databases
-
SheetsIO
Small configurable Java app that pulls data from a Google Spreadsheet (using v4 api) and writes to files and a local webserver.
-
Data-Structures-and-Algorithms
Solutions to Arrays, Strings, Lists, Sorting, Stacks, Trees and General DS problems using Java. (by anishkumar127)
Java Data discussion
Java Data related posts
-
Polaris Catalog: An Open Source Catalog for Apache Iceberg
-
Show HN: PgCompare – Data comparison made simple
-
A deep dive into the concept and world of Apache Iceberg Catalogs
-
Apple releases Pkl – configuration as code language
-
Multi-Database Support in DuckDB
-
Why is Hive Metastore everywhere? (Especially Iceberg)
-
Missouri trans 'snitch form' down after people spammed it with the 'Bee Movie' script
-
Index
What are some of the best open-source Data projects in Java? This list will help you:
# | Project | Stars |
---|---|---|
1 | Presto | 16,196 |
2 | kestra | 15,838 |
3 | pkl | 10,487 |
4 | data-transfer-project | 3,576 |
5 | proteus | 1,306 |
6 | nessie | 1,129 |
7 | jimmer | 1,087 |
8 | micronaut-data | 471 |
9 | riot | 295 |
10 | rapiddweller-benerator-ce | 147 |
11 | pgCompare | 122 |
12 | ModelRunner | 56 |
13 | nextcloud-tables | 40 |
14 | Db4o-gpl | 30 |
15 | SheetsIO | 23 |
16 | Data-Structures-and-Algorithms | 12 |
17 | SparkDB | 3 |