PDF-Extract-Kit Alternatives
Similar projects and alternatives to PDF-Extract-Kit
-
FLiPStackWeekly
FLaNK AI Weekly covering Apache NiFi, Apache Flink, Apache Kafka, Apache Spark, Apache Iceberg, Apache Ozone, Apache Pulsar, and more...
-
PaddleOCR
Multilingual OCR toolkit based on PaddlePaddle: a practical, ultra-lightweight OCR system that supports recognition of 80+ languages, provides data annotation and synthesis tools, and supports training and deployment across server, mobile, embedded, and IoT devices.
-
document-ai-samples
Sample applications and demos for Document AI, the end-to-end document processing platform on Google Cloud
-
PyMuPDF
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
-
AI-in-a-Box
AI-in-a-Box leverages the expertise of Microsoft across the globe to develop and provide AI and ML solutions to the technical community. Our intent is to present a curated collection of solution accelerators that can help engineers establish their AI/ML environments and solutions rapidly and with minimal friction.
-
wdoc
Summarize and query a large number of heterogeneous documents. Any LLM provider, any filetype, scalable, under development.
-
MinerU
A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.
-
AIM-Partioning
Milvus, AI, Python, JSON, REST, Zilliz, RSS, Articles, HuggingFace, Embedding, Partitioning
-
PDF-Extract-Kit discussion
PDF-Extract-Kit reviews and mentions
-
AIM Weekly for 23 September 2024
📎 Scaling Databases for GenAI 🤖 Streaming Vectors Webinar 📊 SQL, NoSQL, Vectors 📱 Super new model from MS - GRIN MoE 🛼 Salesforce Keynote 📢 RDBMS vs Vector DB 🐈‍⬛ DBTA Top 75 🌐 LitServe 📊 Langchain with Filtering 🖥️ PDF Extract Kit 👽 LLM Testing 🖥️ Easy Milvus Schema Generation 🌐 Uber's Query GPT
-
Ask HN: What are you using to parse PDFs for RAG?
Stats
opendatalab/PDF-Extract-Kit is an open-source project licensed under the GNU Affero General Public License v3.0, which is an OSI-approved license.
The primary programming language of PDF-Extract-Kit is Python.