memprompt
unilm
| | memprompt | unilm |
|---|---|---|
| Mentions | 4 | 40 |
| Stars | 320 | 18,319 |
| Growth | - | 5.9% |
| Activity | 1.7 | 9.0 |
| Last commit | about 1 year ago | 5 days ago |
| Language | Python | Python |
| License | Apache License 2.0 | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
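The exact formula behind the activity number isn't published here; as a purely hypothetical illustration (the half-life and scaling below are invented for the sketch), a recency-weighted score could be computed like this:

```python
import math
import time

def activity_score(commit_timestamps, half_life_days=30.0):
    # Hypothetical recency-weighted score: each commit contributes
    # exponentially less the older it is, so recent work dominates,
    # matching the description that recent commits have higher weight.
    now = time.time()
    decay = math.log(2) / (half_life_days * 86400)  # per-second decay rate
    return sum(math.exp(-decay * (now - t)) for t in commit_timestamps)
```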
memprompt
-
Allen Institute for Artificial Intelligence Introduces MemPrompt: A New Method to “fix” GPT-3 After Deployment with User Interaction
Quick Read: https://www.marktechpost.com/2022/12/18/allen-institute-for-artificial-intelligence-introduces-memprompt-a-new-method-to-fix-gpt-3-after-deployment-with-user-interaction/
Paper: https://arxiv.org/abs/2201.06009
Code: https://github.com/madaan/memprompt
-
Building a Virtual Machine Inside ChatGPT
It's already possible to get some of this effect with Codex. The trick is to keep appending the interaction to the prompt, to maintain a memory of sorts (see the sketch below).
For example, you can replicate all the prompts here: https://twitter.com/yoavgo/status/1599200756631887872 with prompt + memory.
The notebook at https://github.com/madaan/memprompt/blob/main/YoavsPythonPro... shows a demo of this.
Some of these ideas were earlier discussed in our work on memory-assisted prompting [1].
[1] https://arxiv.org/pdf/2201.06009.pdf
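To make the trick concrete, here is a minimal sketch of prompt + memory; `llm_complete` is a stand-in for whatever completion API you use (Codex, GPT-3, etc.), not a real library call:

```python
def llm_complete(prompt: str) -> str:
    # Stand-in for your actual LLM completion call.
    raise NotImplementedError("plug in your completion API here")

def run_session(turns):
    """Replay a conversation, carrying the full transcript as 'memory'."""
    memory = ""  # everything said so far, prepended to each new prompt
    for user_input in turns:
        prompt = memory + f"User: {user_input}\nAssistant:"
        reply = llm_complete(prompt)
        # Append both sides of the exchange so later turns can refer back.
        memory += f"User: {user_input}\nAssistant: {reply}\n"
    return memory
```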
-
[D] Paper Review Video - Memory-assisted prompt editing to improve GPT-3 after deployment
Code for https://arxiv.org/abs/2201.06009 found: https://github.com/madaan/memprompt
unilm
- The Era of 1-Bit LLMs: Training Tips, Code and FAQ [pdf]
- The Era of 1-Bit LLMs: Training Tips, Code and FAQ
-
The Era of 1-bit LLMs: ternary parameters for cost-effective computing
+1 on this; the real proof would have been testing both models side by side.
It seems that it may be published on GitHub [1], according to Hugging Face [2].
[1] https://github.com/microsoft/unilm/tree/master/bitnet
[2] https://huggingface.co/papers/2402.17764
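For context, "ternary parameters" means every weight is constrained to {-1, 0, +1}. Here is a rough sketch of the absmean quantization as I read it in the BitNet b1.58 paper; this is my illustration, not code from the repo:

```python
import torch

def ternarize(w: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    # Absmean quantization (per the BitNet b1.58 paper): scale weights by
    # their mean absolute value, then round and clip every entry to the
    # ternary set {-1, 0, +1}.
    gamma = w.abs().mean()
    return (w / (gamma + eps)).round().clamp(-1, 1)
```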
- I'm an Old Fart and AI Makes Me Sad
-
On building a semantic search engine
e5-mistral is essentially a distillation from GPT-4 to a smaller model. You can see at https://github.com/microsoft/unilm/blob/16da2f193b9c1dab0a69... that they actually have custom prompts for each dataset being tested.
The question is: if you haven't seen the task before, what is a good prompt to prepend for it?
IMO, e5-mistral is overfit to MTEB.
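As an illustration of what "custom prompts" means here: e5-mistral-style models prepend a task instruction to each query before embedding it. The `Instruct:/Query:` template below follows the e5-mistral-7b-instruct model card; the example instruction is just a guess at what you'd write for an unseen task:

```python
def build_query(task_description: str, query: str) -> str:
    # Template from the e5-mistral-7b-instruct model card: the task
    # instruction is prepended to the query before it is embedded.
    return f"Instruct: {task_description}\nQuery: {query}"

# For an unseen task you have to guess a reasonable instruction:
text = build_query(
    "Given a web search query, retrieve relevant passages that answer the query",
    "how does memory-assisted prompting work?",
)
```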
-
Leveraging GPT-4 for PDF Data Extraction: A Comprehensive Guide
LayoutLM v1, v2, and v3 models [GitHub]; DocBERT [GitHub]
-
Microsoft Publishes LongNet: Scaling Transformers to 1,000,000,000 Tokens
The repository is available here: https://github.com/microsoft/unilm
-
Recommended open LLMs with image input modality?
It is missing Kosmos-2. I remember its image captioning was really good (the demo is currently down), and it's almost as fast as LLaVA and LaVIN.
-
LongNet: Scaling Transformers to 1,000,000,000 Tokens
Should be this: https://github.com/microsoft/unilm/
-
[R] LongNet: Scaling Transformers to 1,000,000,000 Tokens
This is from Microsoft Research (Asia). https://aka.ms/GeneralAI
What are some alternatives?
gpt-scrolls - A collaborative collection of open-source safe GPT-3 prompts that work well
transformers - 🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.