TruthfulQA
recurrent-memory-transformer
| | TruthfulQA | recurrent-memory-transformer |
|---|---|---|
| Mentions | 4 | 7 |
| Stars | 508 | 741 |
| Growth | - | - |
| Activity | 2.8 | 5.9 |
| Last commit | 6 months ago | 11 days ago |
| Language | Jupyter Notebook | Jupyter Notebook |
| License | Apache License 2.0 | - |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
TruthfulQA
-
airoboros gpt-4 instructed + context-obedient question answering
Dataset: https://github.com/sylinrl/TruthfulQA
-
Scaling Transformer to 1M tokens and beyond with RMT
this is a great point.
do you know of any benchmarks doing this today?
given the acute need to evaluate models on contextual factuality, we're exploring how to create a benchmark for this purpose but prefer existing benchmarks if possible.
openai's truthfulqa[0] is close but does not focus on contextual factuality and targets a much harder problem of absolute truth.
if none exist, and people are interested in contributing, please reach out.
[0] https://github.com/sylinrl/TruthfulQA
-
[D] Is all the talk about what GPT can do on Twitter and Reddit exaggerated or fairly accurate?
I agree they show that you can brute-force mimic uncertainty estimates to some degree, and that the model is generally well calibrated (though on what is basically a set of trivia questions, so YMMV)... yet:
-
[R] TruthfulQA: Measuring How Models Mimic Human Falsehoods
Code for https://arxiv.org/abs/2109.07958 found: https://github.com/sylinrl/TruthfulQA
recurrent-memory-transformer
-
Scaling Transformer to 1M tokens and beyond with RMT
I found the GitHub link: https://github.com/booydar/t5-experiments/tree/scaling-report
Here's a list of tools for scaling up transformer context that have github repos:
* FlashAttention: In my experience, the current best solution for n² attention, but it's very hard to scale it beyond the low tens of thousands of tokens. Code: https://github.com/HazyResearch/flash-attention
* Heinsen Routing: In my experience, the current best solution for n×m attention. I've used it to pull up more than a million tokens as context. It's not a substitute for n² attention. Code: https://github.com/glassroom/heinsen_routing
* RWKV: A sort-of-recurrent model which claims to have performance comparable to n² attention in transformers. In my limited experience, it doesn't. Others agree: https://twitter.com/arankomatsuzaki/status/16390003799784038... . Code: https://github.com/BlinkDL/RWKV-LM
* RMT (this method): I'm skeptical that the recurrent connections will work as well as n² attention in practice, but I'm going to give it a try. Code: https://github.com/booydar/t5-experiments/tree/scaling-repor...
In addition, there's a group at Stanford working on state-space models that looks promising to me. The idea is to approximate n² attention dynamically using only O(n log n) compute. There's no code available, but here's a blog post about it: https://hazyresearch.stanford.edu/blog/2023-03-27-long-learn...
If anyone here has other suggestions for working with long sequences (hundreds of thousands to millions of tokens), I'd love to learn about them.
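To put rough numbers on the n² versus recurrent-segment trade-off discussed in the list above, here is a back-of-the-envelope sketch. It only counts attention score entries per head and layer, and it assumes a simplified model in which each segment attends to itself plus a small block of memory tokens. The function names, the segment length of 512, and the 10 memory tokens are illustrative assumptions, not values taken from the RMT paper or repository.

```python
def full_attention_scores(n_tokens: int) -> int:
    """Entries in the n x n score matrix of full self-attention (one head, one layer)."""
    return n_tokens * n_tokens


def segmented_attention_scores(n_tokens: int, segment_len: int, memory_tokens: int) -> int:
    """Score entries when the input is processed segment by segment.

    Simplifying assumption: every segment attends only to its own tokens plus
    `memory_tokens` carried over from the previous segment, and the last
    (possibly shorter) segment is counted as if it were padded to full length.
    """
    n_segments = -(-n_tokens // segment_len)  # ceiling division
    per_segment = (segment_len + memory_tokens) ** 2
    return n_segments * per_segment


if __name__ == "__main__":
    n = 1_000_000
    full = full_attention_scores(n)
    seg = segmented_attention_scores(n, segment_len=512, memory_tokens=10)
    print(f"full: {full:.3e}  segmented: {seg:.3e}  ratio: {full / seg:.0f}x")
```

The segmented count grows linearly with the number of tokens, which is why million-token inputs become feasible at all. It says nothing about whether the memory tokens carry enough information between segments, which is the open question raised in the comment above.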
Having checked the actual results (https://github.com/booydar/t5-experiments/blob/a6c478754530cdee2a67974e44a0c1b6dbad92c4/results/babilong.csv), I think it's cute, but not a real breakthrough.
-
Code for Scaling Transformer to 1M tokens and beyond with RMT (arxiv.org)
As all...
https://github.com/booydar/t5-experiments/tree/scaling-repor...
What are some alternatives?
safari - Convolutions for Sequence Modeling
auto-evaluator
flash-attention - Fast and memory-efficient exact attention
heinsen_routing - Reference implementation of "An Algorithm for Routing Vectors in Sequences" (Heinsen, 2022) and "An Algorithm for Routing Capsules in All Domains" (Heinsen, 2019), for composing deep neural networks.
JARVIS - JARVIS, a system to connect LLMs with ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf