| | paper-qa | paperetl |
---|---|---|
Mentions | 11 | 12 |
Stars | 3,719 | 321 |
Growth | - | 1.9% |
Activity | 8.6 | 6.3 |
Last commit | 6 days ago | 7 months ago |
Language | Python | Python |
License | Apache License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
paper-qa
- WARC-GPT: An Open-Source Tool for Exploring Web Archives Using AI
https://github.com/whitead/paper-qa
> LLM Chain querying a scientific Zotero library, with citations
pip install paper-qa
- Oracle of Zotero: LLM QA of Your Research Library
Why does this post link to a renamed fork of Paper-QA (https://github.com/whitead/paper-qa) which has made zero changes and is 19 commits behind the original?
- [P] A Large Language Model for Healthcare | NHS-LLM and OpenGPT
To be honest, I'm not too sure about this part, and think that it is probably not the best approach to have the model itself generate references. I prefer the approach used in e.g. paperqa, but wanted to explore both options.
- Looking for a paper summarizer
I’ve come across Paper QA (GitHub page) and, as a graduate student, I loved the idea that when I do a literature review and find tons of papers, I can just ask the AI to find the info I’m looking for in the paper. However, this service requires an OpenAI API key, which I’ve acquired, but it turns out it’s a paid service. The free key doesn’t get me anything. Is there a service/software like this that is free? Or something that I can host on my PC instead of using other people’s servers, so it’s cheaper/free?
- ChatPDF – Chat with Any PDF
I tried it [1] a lot, but I must say it confuses me most of the time and I need to read the original text to check if it makes sense. Lots of times it doesn't.
[1] https://github.com/whitead/paper-qa
- Alternatives to Pinecone? (Vector databases) [D]
- DIY natural language processing - How to start, techniques guidance
Have a look at this: https://github.com/whitead/paper-qa
- Show HN: Document Q&A with GPT: web, .pdf, .docx, etc.
1: We are finding out. Someone else mentioned: https://github.com/whitead/paper-qa We're hoping to keep our service accessible and easy to use, and to add features, such as those from your other questions...
2: We are thinking about website integration. Do you think OpenAI may release this too? Receiving questions by email is a new idea that sounds interesting!
3: Thanks for the suggestion – we will look into it.
- GitHub - whitead/paper-qa: LLM Chain for answering questions from documents with citations
- Paper QA: LLM Chain for answering questions from documents with citations
paperetl
- Show HN: Open-source Rule-based PDF parser for RAG
- Oracle of Zotero: LLM QA of Your Research Library
Nice project!
I've spent quite a lot of time in the medical/scientific literature space. With regard to LLMs, specifically RAG, how the data is chunked is quite important. With that, I have a couple of projects that might be beneficial additions.
paperetl (https://github.com/neuml/paperetl) - supports parsing arXiv and PubMed, and integrates with GROBID to parse metadata and text from arbitrary papers.
paperai (https://github.com/neuml/paperai) - builds embeddings databases of medical/scientific papers. Supports LLM prompting, semantic workflows and vector search. Built with txtai (https://github.com/neuml/txtai).
While arbitrary chunking/splitting can work, I've found that integrating parsing that has knowledge of medical/scientific paper structure increases the overall accuracy and experience of downstream applications.
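To illustrate the difference the comment above describes, here is a minimal sketch contrasting arbitrary fixed-size splitting with splitting on paper sections. It is not how paperetl or paperai work internally; the function names, the `SECTION_HEADINGS` list, and the sample text are all assumptions made up for this example, and real papers need a proper parser (such as GROBID) rather than a heading regex.

```python
import re

# Hypothetical heading names for illustration; real papers vary widely.
SECTION_HEADINGS = ("Abstract", "Introduction", "Methods", "Results",
                    "Discussion", "Conclusion")


def fixed_size_chunks(text, size):
    """Arbitrary chunking: cut every `size` characters, ignoring structure."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def section_chunks(text, headings=SECTION_HEADINGS):
    """Structure-aware chunking: split at lines that contain only a known heading.

    Text appearing before the first recognized heading is dropped in this sketch.
    """
    alternatives = "|".join(re.escape(h) for h in headings)
    pattern = re.compile(r"^(%s)[ \t]*$" % alternatives, re.MULTILINE)
    matches = list(pattern.finditer(text))
    chunks = []
    for idx, match in enumerate(matches):
        # A section runs from the end of its heading to the start of the next one.
        end = matches[idx + 1].start() if idx + 1 < len(matches) else len(text)
        chunks.append({"section": match.group(1),
                       "text": text[match.end():end].strip()})
    return chunks


# Made-up sample text standing in for an already-extracted paper.
paper = """Abstract
We study X.
Methods
We measured Y in 40 samples.
Results
Y increased by 12%.
"""

for chunk in section_chunks(paper):
    print(chunk["section"], "->", chunk["text"])
```

With section-based chunks, each piece carries a label such as "Methods" or "Results" that a downstream retrieval step can filter or weight on, whereas `fixed_size_chunks(paper, 30)` can cut a sentence in half and mix the end of one section with the start of the next.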
- [P] Parse research papers into structured data
paperai | paperetl
- Parse research papers into a structured dataset
- ETL for medical and scientific papers
- Show HN: ETL for Medical and Scientific Papers
- Seeking Advice: How to extract Abstract from scientific journals (.pdfs) 10k+.
paperai and paperetl are a set of projects to consider for this task.
- paperetl: ETL processes for medical and scientific papers
What are some alternatives?
vault-ai - OP Vault ChatGPT: Give ChatGPT long-term memory using the OP Stack (OpenAI + Pinecone Vector Database). Upload your own custom knowledge base files (PDF, txt, epub, etc) using a simple React frontend.
SciencePlots - Matplotlib styles for scientific plotting
simple-llm-finetuner - Simple UI for LLM Model Finetuning
tika-python - Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.
gpt4-pdf-chatbot-langchain - GPT4 & LangChain Chatbot for large PDF docs
ciscoconfparse - Parse, Audit, Query, Build, and Modify Cisco IOS-style configurations.
langchain - ⚡ Building applications with LLMs through composability ⚡ [Moved to: https://github.com/langchain-ai/langchain]
paperai - 📄 🤖 Semantic search and workflows for medical/scientific papers
google-research - Google Research
rdm - Our regulatory documentation manager. Streamlines 62304, 14971, and 510(k) documentation for software projects.
OpenGPT - A framework for creating grounded instruction based datasets and training conversational domain expert Large Language Models (LLMs).
dagster - An orchestration platform for the development, production, and observation of data assets.