-
llmwhisperer-pdf-checkbox-processing
Demonstration and companion repo that shows how to process PDF form elements like checkboxes and radiobuttons with LLMWhisperer
The source code for the project can be found here on GitHub. To successfully run the extraction script, you’ll need 2 different API keys. One for LLMWhisperer and the other for OpenAI APIs. Please be sure to read the Github project’s README to fully understand OS and other dependency requirements. You can sign up for LLMWhisperer, get your API key, and process up to 100 pages per day free of charge.
-
Stream
Stream - Scalable APIs for Chat, Feeds, Moderation, & Video. Stream helps developers build engaging apps that scale to millions with performant and flexible Chat, Feeds, Moderation, and Video APIs and SDKs powered by a global edge network and enterprise-grade infrastructure.
-
If you carefully think about it, the system that extracts raw text from the PDF needs to both detect and render PDF form elements like checkboxes and radiobuttons in a way that LLMs can understand. In this example, we’ll use LLMWhisperer to extract PDF raw text representing checkboxes and radiobuttons. You can use LLMWhisperer completely free for processing up to 100 pages per day. As for structuring the output from LLMWhisperer, we’ll use GPT3.5-Turbo and we’ll use Langchain and Pydantic to help make our job easy.
-
If you carefully think about it, the system that extracts raw text from the PDF needs to both detect and render PDF form elements like checkboxes and radiobuttons in a way that LLMs can understand. In this example, we’ll use LLMWhisperer to extract PDF raw text representing checkboxes and radiobuttons. You can use LLMWhisperer completely free for processing up to 100 pages per day. As for structuring the output from LLMWhisperer, we’ll use GPT3.5-Turbo and we’ll use Langchain and Pydantic to help make our job easy.