Java Document Processing

Open-source Java projects categorized as Document Processing

Top 5 Java Document Processing Projects

Document Processing
  1. docx4j

    JAXB-based Java library for Word docx, PowerPoint pptx, and Excel xlsx files

  2. Apache POI

    Mirror of Apache POI gitbox. The Java API for Microsoft Documents.

    Project mention: Apache POI: A Beacon of Open Source Innovation and Sustainable Funding | dev.to | 2025-03-06

    Apache POI’s business model is inherently collaborative. The development process is open, with contributions published on its GitHub repository. Developers worldwide contribute code, documentation, and insights to ensure that Apache POI remains responsive to the ever-changing needs of technology. This openness fosters a level of trust and engagement that traditional proprietary software often lacks.

    One of the key enabling factors behind Apache POI’s sustained success is its funding strategy. The project benefits from both large-scale corporate sponsorship and grassroots community contributions. Many enterprises depend on Apache POI for their mission-critical operations, and their financial backing is a testament to the tool’s reliability and value. Additionally, the open source community has embraced newer funding models, such as the token-based incentives seen on platforms that discuss tokenizing open source licenses.

    The practical impact of these funding models is substantial. Transparent financial records and regular community engagement sessions reinforce the project’s accountability and vision. This, in turn, builds confidence among sponsors and users, ensuring that Apache POI remains a driving force in the open source ecosystem.

  3. fastexcel

    Generate and read big Excel files quickly

  4. formkiq-core

    A full-featured Document Management Platform / Document Layer for your application, providing storage, discovery, processing, and retrieval. Deploys directly into your Amazon Web Services Cloud. Please 🌟 star to support our work!

  5. zerocell

    Simple, efficient Excel to POJO library for Java

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Java Document Processing related posts

  • Apache POI: A Beacon of Open Source Innovation and Sustainable Funding

    1 project | dev.to | 6 Mar 2025
  • It seems like almost everyone here is working on a SaaS for other SaaS bootstrappers - is anyone building a product for a vertical outside of email/marketing/forms/dev tools/productivity?

    1 project | /r/SaaS | 6 Jun 2023
  • Anyone using AI for enterprise content management?

    1 project | /r/managers | 31 May 2023
  • [D] Is there any way to filter searches by metadata over current vector DBs like Pinecone?

    2 projects | /r/MachineLearning | 30 May 2023
  • Does anyone have ideas on how to reach out to other startups to pitch our startup program?

    1 project | /r/startups | 19 Apr 2023
  • Show HN: Build your perfect document management system using Open Core software

    1 project | news.ycombinator.com | 19 Apr 2023
  • Email filing & automation methods & systems

    1 project | /r/paralegal | 12 Apr 2023

Index

What are some of the best open-source Document Processing projects in Java? This list will help you:

#  Project       Stars
1  docx4j        2,228
2  Apache POI    2,041
3  fastexcel       788
4  formkiq-core    131
5  zerocell         81

Did you know that Java is the 8th most popular programming language based on number of references?