-
PaLM-rlhf-pytorch
Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM
-
petals
🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
The closest you can get is probably Google's Flan-T5 [1].
It is not the size of the model or the text it was trained on that makes ChatGPT so performant. It is the additional human-assisted training that makes it respond well to instructions. Open-source versions of that training are only just starting to see the light of day [2].
[1] https://huggingface.co/google/flan-t5-xxl
[2] https://github.com/lucidrains/PaLM-rlhf-pytorch
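The human-feedback training mentioned above hinges on a reward model trained on pairs of responses ranked by humans. A minimal sketch of the pairwise (Bradley-Terry style) loss commonly used for that step, with hypothetical function names and plain floats standing in for model outputs:

```python
import math

def pairwise_preference_loss(reward_chosen, reward_rejected):
    # Bradley-Terry style objective used to train RLHF reward models:
    # -log(sigmoid(r_chosen - r_rejected)). It is minimized when the
    # reward model scores the human-preferred response well above the
    # rejected one.
    margin = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# The loss shrinks as the margin in favor of the preferred response grows,
# and is large when the model prefers the rejected response.
print(pairwise_preference_loss(2.0, 0.5) < pairwise_preference_loss(0.5, 2.0))
```

The trained reward model then scores the language model's generations, and a policy-gradient method (PPO in the InstructGPT recipe) nudges the model toward higher-reward outputs.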
what about https://github.com/karpathy/nanoGPT ?
In terms of quality, I think BLOOMZ or mT0 is the best open-source option.
The non-finetuned BLOOM does not compare favorably (in English) with GLM or OPT, both of which have published weights: https://crfm.stanford.edu/helm/v0.1.0/?group=mmlu
> Is the future going to be controlled by big corporations who own the models themselves?
On this subject, there is an effort stemming from BigScience to build an open, distributed inference network, so that people who don't have enough GPUs at home can contribute theirs and get text generation at roughly one word per second: https://github.com/bigscience-workshop/petals#how-does-it-wo...
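The idea behind that network is pipeline parallelism across volunteers: each peer serves a contiguous slice of the model's transformer blocks, and activations are forwarded peer to peer, so no single machine needs the whole model in memory. A conceptual sketch of that routing (toy functions stand in for transformer blocks; this is not the actual Petals API):

```python
# Each "peer" hosts a slice of the model's layers and applies them in order.
def make_peer(layer_fns):
    def serve(hidden):
        for fn in layer_fns:
            hidden = fn(hidden)
        return hidden
    return serve

# Toy 4-layer "model" split across two peers, two layers each.
# Each stand-in layer just adds its index to the hidden state.
layers = [lambda h, i=i: h + i for i in range(4)]
peers = [make_peer(layers[:2]), make_peer(layers[2:])]

def distributed_forward(hidden, chain):
    # The client routes the hidden state through each peer in sequence;
    # the network hop between peers is what caps throughput at about
    # one token (word) per second.
    for peer in chain:
        hidden = peer(hidden)
    return hidden

print(distributed_forward(0, peers))  # 0 + 0 + 1 + 2 + 3 = 6
```

In the real system this loop runs once per generated token, with fault tolerance to reroute around peers that drop offline.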
> vs the serverside systems
I believe this runs client side, but whether it counts as open source is likely open for debate:
https://github.com/ggerganov/whisper.cpp
> what they have is open source
How is https://github.com/openai/whisper not open source?