Java ETL

Open-source Java projects categorized as ETL

Top 6 Java ETL Projects

  • airbyte

    Data integration platform for ELT pipelines from APIs, databases & files to warehouses & lakes.

    Project mention: What are your thoughts on projects using the Elastic License? | 2023-01-26

    Doing a quick GitHub search reveals quite a few projects using the ELv2 license, including Airbyte and InvoiceNinja. Elastic (the company) aside, what are your thoughts on the Elastic License v2? Does your employer allow projects with an ELv2 license? Do you consider it open source? I understand that it's not OSI approved, but wondering where people stand when it comes to commercial open source software.

  • zingg

    Scalable identity resolution, entity resolution, data mastering and deduplication using ML

    Project mention: Ask HN: What is the most impactful thing you've ever built? | 2022-11-18

    As part of my data consulting, I struggled with identity resolution and started working on scalable no-code identity resolution. It has pushed my limits as a software engineer and product builder, and I had to do a lot of learning to build it. It's cool to see people use Zingg in their workflows and save months of working on custom solutions. Big highlight has been North Carolina Open Campaign Data

  • Smooks

    Extensible data integration Java framework for building XML and non-XML fragment-based applications

  • kafka-connect-file-pulse

    🔗 A multipurpose Kafka Connect connector that makes it easy to parse, transform and stream any file, in any format, into Apache Kafka

  • neo4j-jdbc

    JDBC driver for Neo4j

    Project mention: How can an opensource GPLv3/GPLv2 database (such as Neo4j or Virtuoso) be distributed alongside a proprietary software? | 2022-11-29

    Scenario 1) The application uses Neo4j GPLv3 database alongside Neo4j's own exclusive query language called Cypher. The program will have some relevant part of its functionality written in CypherQL even though it connects to the database using an Apache 2.0 licensed driver.

  • dcc-import

    Reference data importers

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2023-01-26.

What are some of the best open-source ETL projects in Java? This list will help you:

#  Project                   Stars
1  airbyte                   9,359
2  zingg                       691
3  Smooks                      356
4  kafka-connect-file-pulse    237
5  neo4j-jdbc                  110
6  dcc-import                    1