-
ragflow
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
It's https://huggingface.co/InfiniFlow/deepdoc, and the code for usage is in https://github.com/infiniflow/ragflow/blob/main/deepdoc/READ... It took me a bit of trial and error to get it working.
It seems to be a YOLOv8 fine-tune. I only ran a couple of tests, but the results were decent. Another model that is supposed to be fine-tuned for borderless tables is https://huggingface.co/keremberke/yolov8m-table-extraction. I haven't had great results with it myself, but it may be worth a try for you.
If anyone is interested in exploring this space, try a similar tool, LLMWhisperer (https://llmwhisperer.unstract.com/). It is part of Unstract, an open-source document processing tool (https://github.com/Zipstack/unstract).