| | primeqa | pixel |
|---|---|---|
| Mentions | 5 | 4 |
| Stars | 702 | 321 |
| Growth | 0.4% | - |
| Activity | 8.2 | 3.8 |
| Last commit | 1 day ago | about 2 months ago |
| Language | Python | Python |
| License | Apache License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
primeqa
- State-of-the-Art Multilingual Question Answering
- ML tool to read a PDF file and answer questions about its content
Check out this project, it might be of some help: primeqa.
- Natural language, chat-based, AI-assisted search for Gmail
Look into primeqa (github/primeqa). With some basic Python programming you can do a lot of things!
- PrimeQA
- With Just ~20 Lines of Python Code, You can Do ‘Retrieval Augmented GPT Based QA’ Using This Open Source Repository Called PrimeQA
Quick Read: https://www.marktechpost.com/2023/03/03/with-just-20-lines-of-python-code-you-can-do-retrieval-augmented-gpt-based-qa-using-this-open-source-repository-called-primeqa/
Paper: https://arxiv.org/pdf/2301.09715.pdf
Github: https://github.com/primeqa/primeqa
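The headline describes the retrieval-augmented QA pattern PrimeQA packages up. As a rough sketch of the pattern itself — not PrimeQA's actual API; the toy corpus, the bag-of-words scoring, and the stubbed `generate()` below are all illustrative assumptions:

```python
import math
import re
from collections import Counter

# Toy corpus standing in for a real document store.
corpus = [
    "PrimeQA is a toolkit for state-of-the-art multilingual question answering.",
    "PIXEL is a language model that operates on rendered images of text.",
    "Retrieval augmented generation feeds retrieved passages to a generator.",
]

def bow(text):
    # Bag-of-words vector; a real system would use dense embeddings.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    num = sum(a[t] * b[t] for t in set(a) & set(b))
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def retrieve(question, k=1):
    # Rank documents by similarity to the question, keep the top k.
    q = bow(question)
    return sorted(corpus, key=lambda d: cosine(q, bow(d)), reverse=True)[:k]

def generate(prompt):
    # Stand-in for a call to a GPT-style model.
    return "(model answer grounded in: " + prompt.splitlines()[0] + ")"

question = "What is PrimeQA?"
context = "\n".join(retrieve(question))
print(generate(f"{context}\nQuestion: {question}\nAnswer:"))
```

The whole pattern really is this short: retrieve passages relevant to the question, then prepend them to the prompt so the generator answers from retrieved context rather than from memory alone.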
pixel
- Image to Code?
https://arxiv.org/abs/2205.06175 - Gato from DeepMind
https://arxiv.org/abs/2207.06991 - An interesting attempt to have a pixel-based language model, which should be inherently multimodal
- [D] Theoretically, could Computer Vision learn language?
PIXEL Language Modelling from Pixels: https://arxiv.org/abs/2207.06991
- The Grind a Day: thousands of Apple II floppy disks archived
> LLMs do not do this, and if you're training the LLMs (at cost) to do this, you're already having to do the very same searching out of materials within the corpus related to what you want.
A more reasonable suggestion would be not training an LLM (which one doesn't want to do anyway) but treating it as a retrieval+summarization task: search the corpus for mentions and similar-by-embedding documents, and summarize. LLMs are good at abstractive summarization with minimal hallucination or error. This can serve as an 'annotated bibliography', a first pass for a human writing it themselves, or the collective summaries can be fed back into the LLM for an overall summary.
The main problem here, I'd guess, is that most of the relevant texts have poor or no OCR, so one can't do that in the first place. But there's a good chance that will mostly stop being an issue in a few years as 'text' LLMs move to images (see e.g. PIXEL https://arxiv.org/abs/2207.06991 or Kosmos https://arxiv.org/abs/2302.14045 or https://arxiv.org/abs/2010.10648#google https://arxiv.org/abs/2012.14271 https://arxiv.org/abs/2209.14156 ) and they will either OCR, embed, or just process images of complex text directly. So, something to keep an eye on, perhaps: there are never going to be enough humans to do all this archiving properly, but perhaps there may eventually be enough GPUs to do it...
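The two-step pass the comment describes — find documents that mention a topic or are similar to a seed text, then summarize each hit into an annotated-bibliography entry — can be sketched as a toy. Here `difflib`'s string-matching ratio is a crude stand-in for embedding similarity, and truncation stands in for an LLM summarizer; the archive, filenames, and threshold are all made up:

```python
from difflib import SequenceMatcher

# Hypothetical OCR'd archive; filenames and contents are invented.
archive = {
    "disk042.txt": "Notes on an Apple II word processor and its file format.",
    "disk107.txt": "A BASIC listing for a simple trajectory game.",
    "disk300.txt": "Letters about archiving Apple II floppy disks properly.",
}

def similar(a, b):
    # Crude lexical similarity; a real pass would compare dense embeddings.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def find_hits(topic, seed, threshold=0.4):
    # Step 1: literal mentions plus similar-by-content documents.
    return [name for name, text in archive.items()
            if topic.lower() in text.lower() or similar(seed, text) >= threshold]

def summarize(text):
    # Stand-in for abstractive LLM summarization.
    return text.split(".")[0] + "."

def annotated_bibliography(topic, seed):
    # Step 2: one short entry per hit, as a first pass for a human editor.
    return {name: summarize(archive[name]) for name in find_hits(topic, seed)}

for name, entry in annotated_bibliography("Apple II", "archiving floppy disks").items():
    print(name, "-", entry)
```

Swapping the two stand-ins for a real embedding model and a real LLM gives exactly the retrieval+summarization pipeline described, with no training involved.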
- [D] What is some recent ideas/papers that you find most interesting?
What are some alternatives?
question_extractor - Generate question/answer training pairs out of raw text.
extreme-bert - ExtremeBERT is a toolkit that accelerates the pretraining of customized language models on customized datasets, described in the paper “ExtremeBERT: A Toolkit for Accelerating Pretraining of Customized BERT”.
cherche - Neural Search
OpenChem - OpenChem: Deep Learning toolkit for Computational Chemistry and Drug Design Research
google-local-results-ai-server - A server code for serving BERT-based models for text classification. It is designed by SerpApi for heavy-load prototyping and production tasks, specifically for the implementation of the google-local-results-ai-parser gem.
4cade - 100s of games at your fingertips, as long as your fingertips are on an Apple ][
RATransformers - RATransformers 🐭- Make your transformer (like BERT, RoBERTa, GPT-2 and T5) Relation Aware!
SquadCalc - A Minimalist Squad Mortar Calculator
transformers - 🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
llmware - Providing enterprise-grade LLM-based development framework, tools, and fine-tuned models.
datasets - 🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools