Extract research paper's references


grobid

11 3,057 9.2 Java

A machine learning software for extracting information from scholarly documents

I would suggest using grobid, a pipeline that converts scientific PDFs into a common XML format that is easy to parse. Grobid has a mature REST API that I've used in some of my own projects. It parses references and matches them to their DOIs using the CrossRef API, with a reported 95% F1 score. As far as I can tell, this should make your job fairly simple: run your papers through grobid, then build a citation graph by comparing document DOIs.
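The workflow described above could look roughly like the following sketch. It is not taken from the original post. It assumes a Grobid server running locally on port 8070, the third-party `requests` package for the upload, and that the `/api/processReferences` endpoint returns TEI XML in which matched references carry an `<idno type="DOI">` element. The function names and the input dictionary are hypothetical helpers made up for illustration.

```python
import xml.etree.ElementTree as ET

TEI_NS = "{http://www.tei-c.org/ns/1.0}"
GROBID_URL = "http://localhost:8070"  # assumption: local Grobid server


def fetch_references_tei(pdf_path, base_url=GROBID_URL):
    """Upload a PDF to Grobid and return the TEI XML of its reference list."""
    import requests  # third-party; imported lazily so the helpers below work without it

    with open(pdf_path, "rb") as pdf:
        response = requests.post(
            f"{base_url}/api/processReferences",
            files={"input": pdf},
            # Ask Grobid to consolidate references against an external
            # bibliographic service so that DOIs can be attached.
            data={"consolidateCitations": "1"},
            timeout=120,
        )
    response.raise_for_status()
    return response.text


def extract_reference_dois(tei_xml):
    """Return the DOI of every reference that has one, lowercased."""
    root = ET.fromstring(tei_xml)
    dois = []
    for bibl in root.iter(f"{TEI_NS}biblStruct"):
        for idno in bibl.iter(f"{TEI_NS}idno"):
            if idno.get("type") == "DOI" and idno.text:
                dois.append(idno.text.strip().lower())
                break  # one DOI per reference is enough
    return dois


def build_citation_graph(references_by_doi):
    """Map each paper's DOI to the cited DOIs that are also in the corpus."""
    corpus = set(references_by_doi)
    return {
        doi: sorted(set(refs) & corpus)
        for doi, refs in references_by_doi.items()
    }
```

In use, you would call `fetch_references_tei` once per PDF, pass the result to `extract_reference_dois`, and collect the lists in a dictionary keyed by each paper's own DOI before calling `build_citation_graph`. References that Grobid could not match to a DOI are simply skipped, so the resulting graph is a lower bound on the true citation links.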


NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.


Related posts

Grobid – ML software for extracting information from scholarly documents
1 project | news.ycombinator.com | 21 Apr 2023

How to create a web app that turns academic papers into text documents
1 project | /r/webdev | 16 Jan 2023

Grobid: Machine learning for extracting information from scholarly documents
2 projects | news.ycombinator.com | 16 Jun 2021

Free/open-source alternatives to Connected Papers...?
2 projects | /r/opensource | 12 Aug 2022

[D] What pdf parser do you use for paragraph parsing for huggingface models
2 projects | /r/MachineLearning | 13 Jul 2021


This page summarizes the projects mentioned and recommended in the original post on /r/LanguageTechnology
Machine Learning scientific-articles PDF Metadata Fulltext
Post date: 1 Jan 2023
