Pixel Alternatives
Similar projects and alternatives to pixel
- extreme-bert: ExtremeBERT is a toolkit that accelerates the pretraining of customized language models on customized datasets, described in the paper "ExtremeBERT: A Toolkit for Accelerating Pretraining of Customized BERT".
- RATransformers 🐭: Make your transformer (like BERT, RoBERTa, GPT-2 and T5) Relation Aware!
- primeqa: The prime repository for state-of-the-art Multilingual Question Answering research and development.
- datasets: 🤗 The largest hub of ready-to-use datasets for ML models, with fast, easy-to-use and efficient data manipulation tools.
- trankit: Trankit is a lightweight Transformer-based Python toolkit for multilingual natural language processing.
pixel reviews and mentions
- Image to Code?
  https://arxiv.org/abs/2205.06175 - Gato from DeepMind
  https://arxiv.org/abs/2207.06991 - an interesting attempt at a pixel-based language model, which should be inherently multimodal
- [D] Theoretically, could Computer Vision learn language?
  PIXEL, "Language Modelling from Pixels": https://arxiv.org/abs/2207.06991
- The Grind a Day: thousands of Apple II floppy disks archived
> LLMs do not do this, and if you're training the LLMs (at cost) to do this, you're already having to do the very same searching out of materials within the corpus related to what you want.
A more reasonable suggestion would be not to train an LLM (which one doesn't want to do anyway) but to treat it as a retrieval+summarization task: search the corpus for mentions and similar-by-embedding documents, and summarize. LLMs are good at abstractive summarization with minimal hallucination or error. This can serve as an 'annotated bibliography', as a first pass for a human writing it themselves, or the collected summaries can be fed back into the LLM for an overall summary.
The main problem here, I'd guess, is that most of the relevant texts have poor or no OCR, so one can't do that in the first place. But there's a good chance that this will mostly stop being an issue in a few years as 'text' LLMs move to images (see e.g. PIXEL https://arxiv.org/abs/2207.06991 or Kosmos https://arxiv.org/abs/2302.14045 or https://arxiv.org/abs/2010.10648#google https://arxiv.org/abs/2012.14271 https://arxiv.org/abs/2209.14156 ), and they will either OCR, embed, or just process images of complex text directly. So, something to keep an eye on, perhaps: there are never going to be enough humans to do all this archiving properly, but perhaps there may eventually be enough GPUs to do it...
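The retrieval+summarization pipeline described in that comment can be sketched in a few lines. This is a minimal illustration, not anything from the pixel repo: the `embed()` below is a toy hashed bag-of-words vector standing in for a real sentence-embedding model, and the summarization step is left as a comment since it would call out to an LLM.

```python
# Sketch of the retrieval-first approach: embed documents, rank them by
# cosine similarity to a query, and hand the top hits to a summarizer.
# embed() is a toy stand-in for a real embedding model (hypothetical).
import math
from collections import Counter

def embed(text, dim=64):
    """Toy embedding: a hashed bag-of-words vector."""
    vec = [0.0] * dim
    for word, count in Counter(text.lower().split()).items():
        vec[hash(word) % dim] += count
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, corpus, k=2):
    """Return the k documents most similar to the query by embedding."""
    q = embed(query)
    return sorted(corpus, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

corpus = [
    "Apple II floppy disk archiving and imaging notes",
    "Recipe for sourdough bread",
    "Catalog of Apple II software preserved on floppy disks",
]
hits = retrieve("Apple II floppy disks", corpus)
# hits would then be passed to an LLM for abstractive summarization,
# producing the 'annotated bibliography' described above.
```

In practice one would swap the toy `embed()` for a proper embedding model and batch the summarization step; the ranking logic itself stays the same.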
- [D] What are some recent ideas/papers that you find most interesting?
Stats
xplip/pixel is an open-source project licensed under the Apache License 2.0, an OSI-approved license.
The primary programming language of pixel is Python.