nlm-ingestor

This repo provides the server-side code that the llmsherpa API connects to. It includes parsers for various file formats. (by nlmatics)

nlm-ingestor reviews and mentions

Posts with mentions or reviews of nlm-ingestor. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-03-06.
  • Pg_vectorize: The simplest way to do vector search and RAG on Postgres
    6 projects | news.ycombinator.com | 6 Mar 2024
    >tree-based approach to organize and summarize text data, capturing both high-level and low-level details.

    https://twitter.com/parthsarthi03/status/1753199233241674040

    nlm-ingestor processes documents, organizing content and improving readability. It handles sections, paragraphs, links, tables, lists, and page continuations; removes redundancies and watermarks; and applies OCR. It also supports HTML and other formats through Apache Tika:

    https://github.com/nlmatics/nlm-ingestor

  • Show HN: Open-source Rule-based PDF parser for RAG
    9 projects | news.ycombinator.com | 23 Jan 2024
    Here's another notebook from the repo with examples: https://github.com/nlmatics/nlm-ingestor/blob/main/notebooks...

Stats

Basic nlm-ingestor repo stats
Mentions: 3
Stars: 810
Activity: 7.1
Last commit: 17 days ago

nlmatics/nlm-ingestor is an open source project licensed under the Apache License 2.0, an OSI-approved license.

The primary programming language of nlm-ingestor is Python.
