Grobid Alternatives
Similar projects and alternatives to grobid
-
txtai
💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows
-
examples
Analyze the unstructured data with Towhee, such as reverse image search, reverse video search, audio classification, question and answer systems, molecular search, etc. (by towhee-io)
-
science-parse
Science Parse parses scientific papers (in PDF form) and returns them in structured form.
-
aleph
Search and browse documents and data; find the people and companies you look for. (by alephdata)
-
nlm-ingestor
This repo provides the server side code for llmsherpa API to connect. It includes parsers for various file formats.
grobid discussion
grobid reviews and mentions
-
Open-source tool helps you convert PDF documents, web pages, etc., into Markdown
Anyone know how this compares to GROBID [1]? I'm looking at alternatives to GROBID as I'm not super pleased with its outputs. GROBID has a lot of great features for journal papers (reference extraction / parsing), but I'm only interested in cleanly extracting the body. Also considering nougat [2] but I haven't tried it yet.
[1] https://github.com/kermitt2/grobid
[2] https://github.com/facebookresearch/nougat
- FLaNK-AIM Weekly 06 May 2024
- Show HN: Open-source Rule-based PDF parser for RAG
- How to ingest image based PDFs into private GPT model?
- 🥪 Best Sites For ebooks, articles, research papers etc..🥪
- Grobid – ML software for extracting information from scholarly documents
-
How to create a web app that turns academic papers into text documents
Interesting concept. Grobid tries to do the same https://github.com/kermitt2/grobid
-
Extract research paper's references
I would suggest using grobid - a pipeline for extracting scientific PDFs into a common XML format which can be easily parsed. Grobid has quite a nice mature REST API that I've used in some of my own projects. It parses references and matches them to their DOI using the CrossRef API with a reported 95% F1 score. This should make your job pretty simple as far as I can tell - all you'd need to do is run your papers through grobid and then build a citation graph by comparing document DOIs.
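The citation-graph idea in the comment above can be sketched roughly as follows. This is a minimal illustration, not GROBID's own tooling: it assumes each paper has already been run through a GROBID service (for example its `/api/processFulltextDocument` endpoint with citation consolidation enabled) and that the resulting TEI XML carries DOIs in `<idno type="DOI">` elements, both in the header and in the reference list. The function names `extract_dois` and `build_citation_graph` are hypothetical helpers introduced here.

```python
import xml.etree.ElementTree as ET

# TEI namespace used by GROBID's XML output.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def _normalize(doi):
    # DOIs are case-insensitive, so compare them in lowercase.
    return doi.strip().lower()


def extract_dois(tei_xml):
    """Return (paper_doi, [reference_dois]) for one TEI document.

    Assumption: the paper's own DOI sits somewhere under <teiHeader>
    and each cited work is a <biblStruct> inside <listBibl>.
    """
    root = ET.fromstring(tei_xml)

    paper_doi = None
    header = root.find("tei:teiHeader", TEI_NS)
    if header is not None:
        own = header.find(".//tei:idno[@type='DOI']", TEI_NS)
        if own is not None and own.text:
            paper_doi = _normalize(own.text)

    reference_dois = []
    for bibl in root.findall(".//tei:listBibl/tei:biblStruct", TEI_NS):
        idno = bibl.find(".//tei:idno[@type='DOI']", TEI_NS)
        if idno is not None and idno.text:
            reference_dois.append(_normalize(idno.text))
    return paper_doi, reference_dois


def build_citation_graph(tei_documents):
    """Map each paper DOI to the sorted DOIs it cites within the corpus."""
    cited = {}
    for tei_xml in tei_documents:
        paper_doi, reference_dois = extract_dois(tei_xml)
        if paper_doi is None:
            continue  # no DOI found, so the paper cannot be a graph node
        cited[paper_doi] = set(reference_dois)

    known = set(cited)
    # Keep only edges that point at papers we actually parsed.
    return {doi: sorted(refs & known) for doi, refs in cited.items()}
```

Papers without a resolvable DOI are skipped in this sketch; a real pipeline would probably need a fallback such as title matching.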
- Free/open-source alternatives to Connected Papers...?
-
Seeking Advice: How to extract Abstract from scientific journals (.pdfs) 10k+.
Just use science-parse or GROBID. They have been designed for that exact reason.
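For the GROBID route, pulling the abstract out of the returned XML might look like the sketch below. It assumes the PDFs have already been processed by a GROBID service (for instance through `/api/processHeaderDocument` with TEI XML requested as the response format) and that the abstract sits in `<profileDesc>/<abstract>` of the TEI output; `extract_abstract` is a hypothetical helper name, and science-parse returns a different (JSON) structure that is not covered here.

```python
import xml.etree.ElementTree as ET

# TEI namespace used by GROBID's XML output.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def extract_abstract(tei_xml):
    """Return the abstract as plain text, or None if none was extracted."""
    root = ET.fromstring(tei_xml)
    abstract = root.find(".//tei:profileDesc/tei:abstract", TEI_NS)
    if abstract is None:
        return None
    # Flatten nested markup (paragraphs, inline formatting) and
    # collapse runs of whitespace into single spaces.
    text = " ".join("".join(abstract.itertext()).split())
    return text or None
```

For a corpus of 10k+ files, this function would be applied to each response in a loop, with the `None` cases logged for manual review.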
Stats
kermitt2/grobid is an open source project licensed under Apache License 2.0 which is an OSI approved license.
The primary programming language of grobid is Java.