Scalable Processing of Swiss PDF Documents using 2D Barcodes on AWS

This page summarizes the projects mentioned and recommended in the original post on dev.to.

  • document-understanding-solution

    Discontinued. Example of integrating and using Amazon Textract, Amazon Comprehend, Amazon Comprehend Medical, and Amazon Kendra to automate the processing of documents for use cases such as enterprise search and discovery, control and compliance, and general business process workflow.

  • We show how to build a simple and scalable solution that can process Swiss documents, such as Zurich tax statements, salary statements, and QR invoices, using 2D barcodes. In particular, we extend the Document Understanding Solution (DUS) from AWS Labs to support the processing of these Swiss document types by using 2D barcodes. Figure 1 shows two supported example documents, a Swiss salary statement and a Zurich tax statement.

  • aws-lambda-java-libs

    Official mirror for interface definitions and helper classes for Java code running on the AWS Lambda platform.

  • A change to the DynamoDB table triggers an event, which is processed by an AWS Lambda function that adds the file to the appropriate queue: the sync queue for images or the async queue for PDFs.
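
The routing step described above can be sketched in Python as follows. This is a minimal illustration and not the DUS implementation: the attribute names `documentId` and `objectName`, the queue labels, and the list of image extensions are assumptions, and the `send` callable stands in for whatever actually enqueues the message (for example, an SQS client call).

```python
import os
from typing import Callable, List, Optional, Tuple

# Hypothetical queue labels; a real deployment would map these to queue URLs.
SYNC_QUEUE = "sync"
ASYNC_QUEUE = "async"

# Assumed file types; the source only says "image" and "pdf".
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}
PDF_EXTENSIONS = {".pdf"}


def choose_queue(object_name: str) -> Optional[str]:
    """Return the queue label for a file name, or None if the type is unsupported."""
    ext = os.path.splitext(object_name)[1].lower()
    if ext in IMAGE_EXTENSIONS:
        return SYNC_QUEUE
    if ext in PDF_EXTENSIONS:
        return ASYNC_QUEUE
    return None


def route_records(event: dict, send: Callable[[str, dict], None]) -> List[Tuple[str, str]]:
    """Route newly inserted documents from a DynamoDB stream event to a queue.

    `send(queue, message)` is injected so the routing logic can run without AWS access.
    Returns the (queue, object_name) pairs that were sent.
    """
    routed = []
    for record in event.get("Records", []):
        # Only newly inserted items are treated as new documents here (an assumption).
        if record.get("eventName") != "INSERT":
            continue
        # Stream records carry attributes in DynamoDB's typed format, e.g. {"S": "value"}.
        image = record.get("dynamodb", {}).get("NewImage", {})
        object_name = image.get("objectName", {}).get("S")
        document_id = image.get("documentId", {}).get("S")
        if not object_name:
            continue
        queue = choose_queue(object_name)
        if queue is None:
            continue  # unsupported file type, nothing to enqueue
        send(queue, {"documentId": document_id, "objectName": object_name})
        routed.append((queue, object_name))
    return routed
```

Injecting `send` keeps the file-type decision separate from the AWS client, so the same function could be called from a Lambda handler or from a local test.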

  • document-barcodes

    Docbarcodes extracts 1D and 2D barcodes from scanned PDF documents or images. It can be used to automate the extraction and processing of all kinds of documents.

  • We use the docbarcodes package for the barcode extraction process. The extraction consists of multiple steps, since a document can have multiple barcodes distributed over various regions of the page.

  • ZXing

    ZXing ("Zebra Crossing") barcode scanning library for Java, Android

  • Extract the raw barcode from candidate regions: The barcodes are extracted from the candidate regions with ZXing, an open-source library that supports many variations of 1D and 2D barcodes.
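
The per-region extraction loop can be sketched in Python as follows. This is not the docbarcodes code: the page is modeled as a plain grid of grayscale pixel values, the region format is an assumption, and `decode` is a hypothetical placeholder for a real decoder such as a ZXing binding. Skipping repeated payloads is also an assumption, added because overlapping candidate regions may contain the same barcode.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Region = Tuple[int, int, int, int]  # assumed format: (left, top, right, bottom) in pixels
Page = Sequence[Sequence[int]]      # grayscale page as rows of pixel values


def crop(page: Page, region: Region) -> List[List[int]]:
    """Cut a rectangular candidate region out of the page."""
    left, top, right, bottom = region
    return [list(row[left:right]) for row in page[top:bottom]]


def extract_raw_barcodes(
    page: Page,
    regions: Sequence[Region],
    decode: Callable[[List[List[int]]], Optional[str]],
) -> List[str]:
    """Run the decoder on every candidate region and collect distinct payloads.

    `decode` returns the raw barcode text, or None if the region holds no readable code.
    """
    payloads: List[str] = []
    for region in regions:
        payload = decode(crop(page, region))
        # Keep each payload once, in the order it was first found.
        if payload is not None and payload not in payloads:
            payloads.append(payload)
    return payloads
```

Passing the decoder in as a function keeps the loop independent of any specific barcode library, so a different decoder can be swapped in without changing the region handling.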

  • document-understanding-solution

    Example of integrating and using Amazon Textract, Amazon Comprehend, Amazon Comprehend Medical, and Amazon Kendra to automate the processing of documents for use cases such as enterprise search and discovery, control and compliance, and general business process workflow. (by ArlindNocaj)

  • The following code shows the main part of the AWS Lambda Barcode Processing function in Python. The full code will be available soon in the DUS dev branch. You can preview it at the forked branch DUS dev barcodes.

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.
