Transformers-Tutorials vs llama2_aided_tesseract

Transformers-Tutorials

This repository contains demos I made with the Transformers library by HuggingFace. (by NielsRogge)

llama2_aided_tesseract

Enhance Tesseract OCR output for scanned PDFs by applying Large Language Model (LLM) corrections, complete with options for text validation and hallucination filtering. (by Dicklesworthstone)


Project	Transformers-Tutorials	llama2_aided_tesseract
Mentions	7	4
Stars	7,510	195
Growth	-	-
Activity	8.4	7.2
Latest Commit	16 days ago	9 months ago
Language	Jupyter Notebook	Python
License	MIT License	-

Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub.
Growth - month-over-month growth in stars.
Activity - a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones. For example, an activity of 9.0 indicates that a project is among the top 10% of the most actively developed projects that we are tracking.
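The activity description above can be sketched in a few lines. This is only an illustration of the idea, not the site's actual formula: the exponential decay, the `half_life_days` parameter, and the percentile-to-score mapping are all assumptions chosen so that a score of 9.0 corresponds to the top 10% of tracked projects.

```python
from bisect import bisect_right


def recency_weighted_commits(commit_ages_days, half_life_days=30.0):
    """Sum commit weights; a commit's weight halves every half_life_days (assumed decay)."""
    return sum(0.5 ** (age / half_life_days) for age in commit_ages_days)


def activity_score(raw, all_raw):
    """Map a raw score to a 0-10 scale by percentile rank among all tracked projects."""
    ranked = sorted(all_raw)
    # Fraction of tracked projects whose raw score is less than or equal to this one.
    percentile = bisect_right(ranked, raw) / len(ranked)
    return round(10 * percentile, 1)


# Hypothetical data: 100 tracked projects with raw scores 1..100.
tracked = list(range(1, 101))
score = activity_score(90, tracked)
```

With these assumptions, a project whose raw score beats or ties 90% of the tracked projects lands at 9.0, which matches the "top 10%" reading given above.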

Transformers-Tutorials

Posts with mentions or reviews of Transformers-Tutorials. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-04-16.

AI enthusiasm #6 - Finetune any LLM you want💡
2 projects | dev.to | 16 Apr 2024

Most of this tutorial is based on Hugging Face course about Transformers and on Niels Rogge's Transformers tutorials: make sure to check their work and give them a star on GitHub, if you please ❤️

FLaNK Stack Weekly for 07August2023
27 projects | dev.to | 7 Aug 2023

How to annotate compound words to build NER models?
1 project | /r/LanguageTechnology | 31 May 2023

[discussion] Anybody Working with VITMAE?
1 project | /r/MachineLearning | 31 Mar 2023

I'm pretraining on 850K grayscale spectrograms of birdsongs. I'm on epoch 400 out of 800 and the loss has declined from about 1.2 to 0.7. I don't really have a sense of what is "good enough" and I guess the only way I can judge is by looking at the reconstruction. I'm doing that using this notebook as a guide and right now it's doing quite badly.

[D] NLP has HuggingFace, what does Computer Vision have?
7 projects | /r/MachineLearning | 19 Apr 2022

More tutorials can be found at https://github.com/NielsRogge/Transformers-Tutorials.

[Discussion] Information Extraction with LayoutLMv2
1 project | /r/MachineLearning | 30 Dec 2021

I've been looking for an off-the-shelf encoder-decoder document understanding model for key information extraction. I found a great Huggingface implementation with concise notebook examples. However, the token classification model outputs a list of token labels and the corresponding bounding boxes for the tokens, but not the text contained within the labeled bounding boxes themselves. Am I missing something? LayoutLMv2 describes itself as being capable of information extraction, but without extracting the text I feel like it's fallen short of that ambition.

[Project] Deepmind's Perceiver IO available through Hugging Face
1 project | /r/MachineLearning | 16 Dec 2021

Example Notebooks
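On the LayoutLMv2 question quoted above: a token classification model labels the inputs it was given, so the text can usually be recovered by pairing each predicted label with the OCR word it belongs to. The sketch below assumes predictions have already been aligned to word level and use BIO-style tags; the function name `extract_fields`, the sample words, and the label names are hypothetical and not taken from the notebooks.

```python
def extract_fields(words, labels):
    """Group consecutive words of the same entity into (field, text) pairs using BIO labels."""
    fields = []
    current_field, current_words = None, []
    for word, label in zip(words, labels):
        if label == "O":
            prefix, field = "O", None
        else:
            prefix, _, field = label.partition("-")
        # A new span starts on a "B-" tag or whenever the field changes.
        starts_new = prefix == "B" or field != current_field
        if starts_new and current_words:
            fields.append((current_field, " ".join(current_words)))
            current_words = []
        if field is None:
            current_field = None
            continue
        current_field = field
        current_words.append(word)
    if current_words:
        fields.append((current_field, " ".join(current_words)))
    return fields


# Hypothetical OCR words and word-level predictions for one document.
words = ["Invoice", "No", "12345", "Total", "$", "99.00"]
labels = ["B-KEY", "I-KEY", "B-VALUE", "O", "B-TOTAL", "I-TOTAL"]
fields = extract_fields(words, labels)
```

When the model predicts on subword tokens rather than whole words, an extra step is needed to map each token back to its source word before grouping; that step depends on the tokenizer and is left out here.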

llama2_aided_tesseract

Posts with mentions or reviews of llama2_aided_tesseract. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-01-01.

Standard Ebooks
9 projects | news.ycombinator.com | 1 Jan 2024

I made a tool like that, and I bet with a more powerful LLM like GPT4, and perhaps a better baseline OCR tool (like GPT4 vision), it could work really well for this sort of thing:
https://github.com/Dicklesworthstone/llama2_aided_tesseract

Use Llama2 to Improve the Accuracy of Tesseract OCR
1 project | /r/programming | 17 Aug 2023

FLaNK Stack Weekly for 07August2023
27 projects | dev.to | 7 Aug 2023

Show HN: Using LLama2 to Correct OCR Errors
1 project | news.ycombinator.com | 2 Aug 2023
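The project description mentions hallucination filtering for LLM-corrected OCR text. One simple way to approximate that idea is to reject any correction that drifts too far from the raw OCR output. The sketch below is not the repository's method; the function name `accept_correction` and the `min_similarity` threshold of 0.8 are assumptions, and the similarity measure is Python's standard `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher


def accept_correction(ocr_text, llm_text, min_similarity=0.8):
    """Return the LLM text only if it stays close to the OCR text; otherwise keep the OCR text."""
    # ratio() is 1.0 for identical strings and approaches 0.0 for unrelated ones.
    similarity = SequenceMatcher(None, ocr_text, llm_text).ratio()
    return llm_text if similarity >= min_similarity else ocr_text


# A small character-level fix stays close to the OCR text and is accepted.
fixed = accept_correction("Tne quick brovvn fox", "The quick brown fox")

# A long rewrite that adds content not present in the scan is rejected.
kept = accept_correction(
    "Total: 42",
    "The total amount due is forty-two dollars, payable immediately.",
)
```

A character-level threshold like this is crude: it accepts small spelling repairs and blocks large rewrites, but it cannot tell a legitimate heavy correction from an invented passage, so the threshold would need tuning against real scans.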

What are some alternatives?

When comparing Transformers-Tutorials and llama2_aided_tesseract you can also consider the following projects:

nn - 🧑‍🏫 60 Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans (cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠

harlequin - The SQL IDE for Your Terminal.

gorilla-cli - LLMs for your CLI

pytorch-image-models - PyTorch image models, scripts, pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNet-V3/V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more

OpenBuddy - Open Multilingual Chatbot for Everyone

notebooks - Notebooks using the Hugging Face libraries 🤗

CallCMLModel - An example on calling models deployed in CML

adaptnlp - An easy to use Natural Language Processing library and framework for predicting, training, fine-tuning, and serving up state-of-the-art NLP models.

EverythingApacheNiFi - EverythingApacheNiFi

fuzzy-matcher - A Java library to determine probability of objects being similar.