obsidian-omnisearch vs OCRmyPDF
Metric | obsidian-omnisearch | OCRmyPDF
---|---|---
Mentions | 17 | 77
Stars | 997 | 12,067
Growth | - | 2.2%
Activity | 8.9 | 9.5
Last commit | 20 days ago | 9 days ago
Language | TypeScript | Python
License | GNU General Public License v3.0 only | Mozilla Public License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
obsidian-omnisearch
- "Your account will be permanently closed" -- my reasons for leaving Evernote as a loyal user since 2011
Mobile Document Scanning using QuickScan iOS app and OCR search with Omnisearch and Text Extractor: I was a power user of the Scannable app by Evernote for capturing scans of receipts and documents, so moving on from this was going to be tough. But QuickScan has the same functionality (OCR scanning) and has quick outputs to where my scans are stored in my Obsidian folder. Using Omnisearch, searching my scans feels just as intuitive and snappy as what Evernote used to feel like for me.
- Obsidian-Copilot: A Prototype Assistant for Writing and Thinking
In the past I have used Omnisearch which I have found to be an improvement.
https://github.com/scambier/obsidian-omnisearch
- Tip: Use an Obsidian folder to store your ChatGPT threads
Combine this with my favorite Obsidian search plugin Omnisearch and you end up making this bunch of random chat threads useful - now I can link and tag across, and source them for new ideas.
- Using Github to write my notes has helped me retain knowledge immensely.
The Omnisearch plugin might be what you need. No AI but weighted results depending on where your query words are found (filename, titles, frequency...). It works well for me, it's my primary way to find notes.
- Why do you think Obsidian is better than the alternatives?
The tag system works well for GTD workflows and organization in general. Default search isn't the best but the Omnisearch plugin fixes that.
- Search & Omnisearch frustrations - prioritizing exact matches over fuzzy search?
Also - and speaking about plugins in general - the best way to get an issue resolved is to ask it on the GitHub page. If the plugin is maintained, its developer will usually gladly help you solve your problem :) https://github.com/scambier/obsidian-omnisearch/issues
- Is there a way to search for a word or phrase just in the current note?
I think Obsidian Omnisearch can help you with that.
- Perfect note taking and information organizing solution - does it exist ?
The Omnisearch plug-in for Obsidian does search in PDFs and images via OCR.
- Digitalizing 10 years of handwritten notes -- how would you go about doing it?
- PDF notes in Obsidian with Zotero
In my opinion it is absolutely possible. The developer of the Omnisearch plugin now works on PDF indexing - https://github.com/scambier/obsidian-omnisearch/releases/tag/1.6.5-beta.3.
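One of the comments above describes Omnisearch as giving "weighted results depending on where your query words are found (filename, titles, frequency...)". The following is a minimal sketch of that general idea only. It is not the plugin's actual algorithm: the `Note` fields, the `FIELD_WEIGHTS` values, and the `score` and `search` helpers are all assumptions made up for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical weights chosen for illustration; not taken from Omnisearch.
FIELD_WEIGHTS = {"filename": 3.0, "headings": 2.0, "body": 1.0}


@dataclass
class Note:
    filename: str
    headings: str
    body: str


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(note: Note, query: str) -> float:
    """Sum, over every field, weight * number of occurrences of each query word."""
    total = 0.0
    words = tokenize(query)
    for field, weight in FIELD_WEIGHTS.items():
        tokens = tokenize(getattr(note, field))
        for word in words:
            total += weight * tokens.count(word)
    return total


def search(notes: list[Note], query: str) -> list[Note]:
    """Return the notes that match at least once, best score first."""
    scored = [(score(note, query), note) for note in notes]
    hits = [pair for pair in scored if pair[0] > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [note for _, note in hits]


if __name__ == "__main__":
    notes = [
        Note("groceries.md", "Shopping", "keep receipts in a folder"),
        Note("receipts.md", "Scans", "scanned documents"),
    ]
    for note in search(notes, "receipts"):
        print(note.filename, score(note, "receipts"))
```

With these assumed weights, a match in the filename outranks a match in the body, which is the behaviour the comment describes in broad terms.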
OCRmyPDF
- TextSnatcher: Copy text from images, for the Linux Desktop
Try https://github.com/ocrmypdf/OCRmyPDF - it uses Tesseract behind the scenes and it is absolutely brilliant.
- FLaNK Stack Weekly 19 Feb 2024
- Calibre – New in Calibre 7.0
I recommend running any such PDFs through OCRmyPDF.
https://github.com/ocrmypdf/OCRmyPDF
- A better document viewer
If by "like a photocopy" you mean the file contains images of text rather than text, the MacOS viewer presumably does OCR on the images. I don't know if there's a Linux document viewer with that capability built-in, but a quick search turned up the standalone tool OCRmyPDF.
- Is there a (CLI) tool that dynamically adjusts the contrast and brightness of scanned text documents?
- OCR for a full pdf on Neoreader
For anyone interested, I solved the problem by first running the files through the free and open-source software ocrmypdf, available here
- ELI5: why is PDF such a widespread text format, instead of a format that's actually easier to edit?
ocrmypdf is nice for stuff like that.
- Donut: OCR-Free Document Understanding Transformer
- massive crop and OCR newspaper
Use imagemagick to convert them to PDF and ocrmypdf to straighten and OCR. See this explanation.
- OCR pdf and just keep the OCR text
Fair enough, maybe this might work for you. It should separate the text from the image anyway, and if you have Adobe Acrobat it should be able to delete the background too with the edit function. It may already be able to do that if you haven't tried it.
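Several of the comments above suggest running scanned pages through ImageMagick and then OCRmyPDF to straighten and OCR them. A minimal sketch of that two-step pipeline follows, assuming ImageMagick 7 (the `magick` command; older installs use `convert` instead) and the `ocrmypdf` command-line tool are both on the PATH. The file names and the `build_commands` and `run` helpers are hypothetical; `--deskew` and `-l` are existing OCRmyPDF options.

```python
import shutil
import subprocess


def build_commands(images, merged_pdf, searchable_pdf, language="eng"):
    """Return the two command lines as argument lists, without running them."""
    # Step 1: ImageMagick merges the page images into a single PDF.
    to_pdf = ["magick", *[str(image) for image in images], str(merged_pdf)]
    # Step 2: OCRmyPDF straightens (--deskew) the pages and adds a text layer.
    ocr = ["ocrmypdf", "--deskew", "-l", language,
           str(merged_pdf), str(searchable_pdf)]
    return [to_pdf, ocr]


def run(commands):
    """Run each command in order, stopping at the first failure."""
    for command in commands:
        if shutil.which(command[0]) is None:
            raise FileNotFoundError(f"{command[0]} is not installed or not on PATH")
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical input files; print the commands instead of running them.
    for command in build_commands(["page1.png", "page2.png"],
                                  "scan.pdf", "scan_ocr.pdf"):
        print(" ".join(command))
```

Building the argument lists separately from running them makes it easy to inspect the commands first; calling `run(build_commands(...))` would execute them.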
What are some alternatives?
obsidian-omnisearch
obsidian-switcher-plus - Enhanced Quick Switcher plugin for Obsidian.md
obsidian-customizable-sidebar - This Plugin allows you to add every Command to Obsidian's Sidebar Ribbon and add Custom Icons.
cMenu-Plugin - An Obsidian.md plugin that adds a minimal text editor modal for a smoother writing/editing experience ✍🏽.
ObsidianCustomFrames - An Obsidian plugin that turns web apps into panes using iframes with custom styling. Also comes with presets for Google Keep, Todoist and more.
remotely-save - Yet another unofficial Obsidian plugin allowing users to synchronize notes between local device and the cloud service. Supports S3, Dropbox, OneDrive, webdav.
minisearch - Tiny and powerful JavaScript full-text search engine for browser and Node
OCRmyPDF
PaddleOCR - Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
pdfplumber - Plumb a PDF for detailed information about each char, rectangle, line, et cetera, and easily extract text and tables.
tesserocr - A Python wrapper for the tesseract-ocr API
Paperless-ng - A supercharged version of paperless: scan, index and archive all your physical documents
invoice2data - Extract structured data from PDF invoices
pdfminer.six - Community maintained fork of pdfminer - we fathom PDF