gpt-neo vs website

| | gpt-neo | website |
| --- | --- | --- |
| Mentions | 82 | 3 |
| Stars | 6,158 | 7 |
| Growth | - | - |
| Activity | 7.3 | 0.0 |
| Latest commit | about 2 years ago | over 2 years ago |
| Language | Python | CSS |
| License | MIT License | - |
- Stars: the number of stars a project has on GitHub.
- Growth: month-over-month growth in stars.
- Activity: a relative number indicating how actively a project is being developed; recent commits have higher weight than older ones. For example, an activity of 9.0 indicates that a project is among the top 10% of the most actively developed projects we track.
gpt-neo
- How Open is Generative AI? Part 2
By December 2020, EleutherAI had introduced The Pile, a comprehensive text dataset designed for training language models; tech giants such as Microsoft, Meta, and Google subsequently used it to train their own models. In March 2021, EleutherAI revealed GPT-Neo, an open-source model under the Apache 2.0 license that was unmatched in size among open models at its launch. EleutherAI's later projects include the release of GPT-J, a 6 billion parameter model, and GPT-NeoX, a 20 billion parameter model, unveiled in February 2022. Their work demonstrates the viability of high-quality open-source AI models.
- Creating an open source chat bot like ChatGPT for my own dataset without GPU?
Yeah, if that is your requirement you should definitely ignore chatterbot, as it's older and probably not what your teacher wants. I'm looking at the gpt-neo docs right now: https://github.com/EleutherAI/gpt-neo
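If you want to experiment along these lines without a GPU, below is a minimal sketch of a CPU-only completion loop. It assumes the Hugging Face transformers library and the published EleutherAI/gpt-neo-125m checkpoint; the generation settings are illustrative choices, not anything prescribed in the thread.

```python
# Bare-bones CPU "chat" loop with the smallest GPT-Neo checkpoint.
# GPT-Neo is a plain language model, not an instruction-tuned chatbot,
# so replies are free-form completions rather than true dialogue.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-125m")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neo-125m")

while True:
    prompt = input("You: ")
    if not prompt:
        break
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(
        **inputs,
        max_new_tokens=60,
        do_sample=True,
        top_p=0.9,
        pad_token_id=tokenizer.eos_token_id,  # GPT-Neo defines no pad token
    )
    print("Bot:", tokenizer.decode(output[0], skip_special_tokens=True))
```

The 125M model fits comfortably in ordinary RAM, which is what makes the no-GPU setup workable; the trade-off is much weaker output than the 1.3B and 2.7B checkpoints.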
- Any real competitor to GPT-3 which is open source and downloadable?
3.) EleutherAI's GPT-Neo and GPT-NeoX: EleutherAI is an independent research organization that aims to promote open research in artificial intelligence. They have released GPT-Neo, an open-source language model based on the GPT architecture, and are developing GPT-NeoX, a highly scalable GPT-like model. You can find more information in their GitHub repositories:
GPT-Neo: https://github.com/EleutherAI/gpt-neo
GPT-NeoX: https://github.com/EleutherAI/gpt-neox
- ⚡ Neural - AI Code Generation for Vim
This is one of the first comprehensive plugins rewritten to support multiple AI backends, such as OpenAI GPT-3+, with other custom sources such as ChatGPT, GPT-J, GPT-Neo, and more planned for the future.
- Looks like some Taliban fighters are getting burnt out working the 9-5 grind
GPT-Neo is newer than GPT-2 on the open source side of things. In my experience, it tends to give longer and more creative responses than GPT-2 but not on the level of GPT-3. I've not tried GPT-J or GPT-NeoX, but they're also open source and reportedly better than GPT-Neo (albeit less accessible).
- H3 - a new generative language model that outperforms GPT-Neo-2.7B with only *2* attention layers! In H3, the researchers replace attention with a new layer based on state space models (SSMs). With the right modifications, they find that it can outperform transformers.
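For readers who haven't met SSMs, here is a toy sketch of the linear state space recurrence such layers build on. This is not the H3 layer itself: the matrices below are random placeholders, whereas H3 uses carefully structured, learned parameters and stacks this recurrence inside a full network.

```python
# Toy state space recurrence: x_t = A x_{t-1} + B u_t, y_t = C x_t.
# Unlike attention, each step touches only the fixed-size state x,
# so a sequence is processed in time linear in its length.
import numpy as np

rng = np.random.default_rng(0)
d_state, seq_len = 4, 10

A = rng.normal(size=(d_state, d_state)) * 0.1  # state transition (placeholder)
B = rng.normal(size=(d_state, 1))              # input projection (placeholder)
C = rng.normal(size=(1, d_state))              # output projection (placeholder)

u = rng.normal(size=(seq_len, 1))  # a 1-dimensional input signal
x = np.zeros((d_state, 1))         # hidden state

ys = []
for t in range(seq_len):
    x = A @ x + B * u[t]
    ys.append((C @ x).item())
print(ys)  # the layer's scalar output at each time step
```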
- First Open Source Alternative to ChatGPT Has Arrived
- Where is the line for AI and where does ChatGPT stand?
Finally, yes: it is trained via causal language modeling (next-token prediction). The approach has been fairly standard for years; the big difference with the GPT* models is the number of parameters and the volume of text. We still haven't reached a ceiling with LLM parameters: they appear to keep improving with size. This training allows the model to learn a strong representation of language. Their training approach is published, and open-source GPT* versions have already been made and released (https://github.com/EleutherAI/gpt-neo). However, the models are huge and can't be run locally by hobbyists. This gets at larger issues in the democratization of ML.
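As a concrete illustration of that objective, the sketch below computes the next-token-prediction loss for the released GPT-Neo weights via transformers. Passing labels=input_ids makes the library compute the shifted cross-entropy internally; the checkpoint and example sentence are illustrative assumptions.

```python
# Next-token-prediction loss for a released GPT-Neo checkpoint.
# With labels=input_ids, the model internally shifts the targets so
# that token t is predicted from tokens 0..t-1 (causal LM objective).
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-125m")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neo-125m")

batch = tok("The Pile is a large text dataset.", return_tensors="pt")
out = model(**batch, labels=batch["input_ids"])
print(out.loss)  # mean cross-entropy; training backpropagates this
```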
- Using the GPT-3 AI Writer inside Obsidian (This is COOL)
- Teaser trailer for "The Diary of Sisyphus" (2023), the world's first feature film written by an artificial intelligence (GPT-NEO) and produced by Briefcase Films, my indie film studio based in Northern Italy
- GPT-Neo 2.7B, released Mar/2021, and unmaintained/unsupported as of Aug/2021? or;
website
- How do I get started with Jax on TPU VMs
- GPT-J "the open source cousin of GPT-3 everyone can use"
Your view here is entirely reasonable. It was my view before I ever heard about TFRC. I was every bit as skeptical.
That view is wrong. From https://github.com/shawwn/website/blob/master/jaxtpu.md :
> So we're talking about a group of people who are the polar opposite of any Google support experience you may have had.
> Ever struggle with GCP support? They took two weeks to resolve my problem. During the whole process, I vividly remember feeling like, "They don't quite seem to understand what I'm saying... I'm not sure whether to be worried."
> Ever experience TFRC support? I've been a member for almost two years. I just counted how many times they failed to come through for me: zero times. And as far as I can remember, it took less than 48 hours to resolve whatever issue I was facing.
> For a Google project, this was somewhere between "space aliens" and "narnia" on the Scale of Surprising Things.
[...]
> My goal here is to finally put to rest this feeling that everyone has. There's some kind of reluctance to apply to TFRC. People always end up asking stuff like this:
> "I'm just a university student, not an established researcher. Should I apply?"
> Yes!
> "I'm just here to play around a bit with TPUs. I don't have any idea what I'm doing, but I'll poke around a bit and see what's up. Should I apply?"
> Heck yeah!
> "I have a Serious Research Project in mind. I'd like to evaluate whether the Cloud TPU VM platform is sufficient for our team's research goals. Should I apply?"
> Absolutely. But whoever you are, you've probably applied by now. Because everyone is realizing that TFRC is how you accomplish your research goals.
I expect that if you apply, you'll get your activation email within a few hours. Of course, you better get in quick. My goal here was to cause a stampede. Right now, in my experience, you'll be up and running by tomorrow. But if ten thousand people show up from HN, I don't know if that will remain true. :)
I feel a bit bad talking at such length about TFRC. But then I remembered that none of this is off-topic in the slightest. GPT-J was proof of everything above. No TFRC, no GPT-J. The whole reason the world can enjoy GPT-J now is that anyone can show up and start doing as many effective things as they can learn.
It was all thanks to TFRC, the Cloud TPU team, the JAX team, the XLA compiler team -- hundreds of people, who have all managed to gift us this amazing opportunity. Yes, they want to win the ML mindshare war. But they know the way to win it is to care deeply about helping you achieve every one of your research goals.
Think of it like a side hobby. Best part is, it's free. (Just watch out for the egress bandwidth, ha. Otherwise you'll be talking with GCP support for your $500 refund -- and yes, that's an unpleasant experience.)
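For anyone who does get an allocation, a first sanity check might look like the sketch below. It is plain JAX with nothing TPU-specific, so the same snippet runs on CPU too; the least-squares toy loss is purely illustrative.

```python
# Minimal JAX warm-up: jit-compile a loss via XLA and take its gradient.
import jax
import jax.numpy as jnp

@jax.jit
def loss(w, x, y):
    pred = jnp.dot(x, w)              # linear model
    return jnp.mean((pred - y) ** 2)  # mean squared error

grad_loss = jax.grad(loss)  # gradient with respect to w

key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (8, 3))
w = jnp.zeros(3)
y = jnp.ones(8)

print(loss(w, x, y))       # scalar MSE
print(grad_loss(w, x, y))  # gradient, same shape as w
```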
What are some alternatives?
gpt-neox - An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
mesh-transformer-jax - Model parallel transformers in JAX and Haiku
haystack - LLM orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.
helpmecode - Augmented Intelligence Programming
openchat - OpenChat: Easy to use opensource chatting framework via neural networks
swarm-jax - Swarm training framework using Haiku + JAX + Ray for layer parallel transformer language models on unreliable, heterogeneous nodes
tensorflow - An Open Source Machine Learning Framework for Everyone
transformers - 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
lm-evaluation-harness - A framework for few-shot evaluation of language models.
aitextgen - A robust Python tool for text-based AI training and generation using GPT-2.
gpt-2 - Code for the paper "Language Models are Unsupervised Multitask Learners"