Wdoc Alternatives
Similar projects and alternatives to wdoc
-
paperless-ngx
A community-supported supercharged version of paperless: scan, index and archive all your physical documents
-
PaddleOCR
Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
-
Sandstorm
Sandstorm is a self-hostable web productivity suite. It's implemented as a security-hardened web app package manager.
-
gcodepreview
OpenPythonSCAD library for moving a tool in lines and arcs so as to model how a part would be cut using G-Code or described as a DXF.
-
syft
CLI tool and library for generating a Software Bill of Materials from container images and filesystems
-
document-ai-samples
Sample applications and demos for Document AI, the end-to-end document processing platform on Google Cloud
-
tldw
tl/dw (Too Long, Didn't Watch): Your Personal Research Multi-Tool - a naive attempt at 'A Young Lady's Illustrated Primer' (by rmusser01)
-
PyMuPDF
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
-
wdoc discussion
wdoc reviews and mentions
-
Ask HN: What have you built with LLMs?
Here's a highlight (edit: more like an ego dump)
I couldn't keep up with my news, so I made the perfect summarizer that follows the author's thought process: https://github.com/thiswillbeyourgithub/WDoc
I needed an AI-based system that goes through my Anki cards, but I might as well make it able to read dozens of file formats. Now I can feed it entire medical YouTube playlists, conferences, Anki databases, and hundreds of PDFs, then ask a single question across all of them at once.
Both are the same project.
-
Ask HN: What are you using to parse PDFs for RAG?
For my RAG project [WDoc](https://github.com/thiswillbeyourgithub/WDoc/tree/dev), I run multiple PDF parsers and then use heuristics to keep the best output. The code is at https://github.com/thiswillbeyourgithub/WDoc/blob/654c05c5b2...
The heuristics are partly based on using fastText to detect languages: https://github.com/thiswillbeyourgithub/WDoc/blob/654c05c5b2...
It's probably crap for tables but I don't want to rely on external parsers.
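The "run several parsers, keep the best output" idea can be sketched roughly as follows. This is not WDoc's actual code: the function names, the scoring rule (share of alphanumeric and whitespace characters), and the optional `lang_confidence` callback are all assumptions made for illustration. The callback is a stand-in for a language-identification model such as fastText, and the parser outputs are hard-coded strings rather than real PDF extractions.

```python
from typing import Callable, Dict, Optional, Tuple


def printable_ratio(text: str) -> float:
    """Fraction of characters that are letters, digits, or whitespace."""
    if not text:
        return 0.0
    good = sum(ch.isalnum() or ch.isspace() for ch in text)
    return good / len(text)


def score_text(
    text: str, lang_confidence: Optional[Callable[[str], float]] = None
) -> float:
    """Hypothetical quality score in [0, 1]; higher is better."""
    if not text.strip():
        return 0.0  # an empty extraction is never the best one
    score = printable_ratio(text)
    if lang_confidence is not None:
        # Stand-in for a language-ID model: a confident language guess
        # suggests the extracted text is real prose rather than debris.
        score *= lang_confidence(text)
    return score


def pick_best(
    candidates: Dict[str, str],
    lang_confidence: Optional[Callable[[str], float]] = None,
) -> Tuple[str, str]:
    """Return (parser_name, text) for the highest-scoring candidate."""
    if not candidates:
        raise ValueError("no parser output to choose from")
    best = max(
        candidates,
        key=lambda name: score_text(candidates[name], lang_confidence),
    )
    return best, candidates[best]


# Hard-coded outputs from three hypothetical parsers for the same page.
outputs = {
    "parser_a": "Th1s ~~ t#xt %% w@s ^^ m@ngl3d",
    "parser_b": "This text was extracted cleanly.",
    "parser_c": "",
}
name, text = pick_best(outputs)
print(name)  # parser_b
```

A character-ratio score like this one says nothing about table structure, which fits the caveat above about tables.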
-
Ask HN: Is there any software you only made for your own use but nobody else?
-
Ask HN: I have many PDFs – what is the best local way to leverage AI for search?
Don't hesitate to ask for features!
Here's the link: https://github.com/thiswillbeyourgithub/DocToolsLLM/
Stats
thiswillbeyourgithub/wdoc is an open source project licensed under the GNU General Public License v3.0 only, which is an OSI-approved license.
The primary programming language of wdoc is Shell.