mesh-transformer-jax vs dalle-mini

| | mesh-transformer-jax | dalle-mini |
|---|---|---|
| Mentions | 52 | 3,446 |
| Stars | 6,213 | 14,645 |
| Growth | - | - |
| Activity | 0.0 | 5.2 |
| Last commit | over 1 year ago | 6 months ago |
| Language | Python | Python |
| License | Apache License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
mesh-transformer-jax
- Large Language Models: Comparing Gen2/Gen3 Models (GPT-3, GPT-J, MT5 and More)
GPT-J is an LLM case study with two goals: training an LLM on a data source containing unique material, and using the training framework Mesh Transformer JAX to achieve high training efficiency through parallelization. There is no research paper about GPT-J, but its GitHub pages provide the model, several checkpoints, and the complete source code for training.
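For a flavor of what that parallelization means in JAX terms, here is a toy data-parallel loss using jax.pmap. This is a minimal sketch only, not the repo's actual sharding scheme (which splits the model itself across a device mesh); all names here are made up.

```python
import jax
import jax.numpy as jnp

# Toy stand-in for a training loss: mean squared projection.
def loss(params, x):
    return jnp.mean((x @ params) ** 2)

n_dev = jax.local_device_count()
params = jnp.ones((4,))
x = jnp.ones((n_dev, 8, 4))  # one data shard per device

# Broadcast params to every device, split x along its leading axis.
p_loss = jax.pmap(loss, in_axes=(None, 0))
print(p_loss(params, x))  # one loss value per device
```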
- [R] Parallel Attention and Feed-Forward Net Design for Pre-training and Inference on Transformers
This idea has already been proposed in ViT-22B and GPT-J-6B.
- Show HN: Finetune LLaMA-7B on commodity GPUs using your own text
- [D] An Instruct Version Of GPT-J Using Stanford Alpaca's Dataset
Sure. Here's the repo I used for the fine-tuning: https://github.com/kingoflolz/mesh-transformer-jax. I used 5 epochs, and apart from that I kept the default parameters in the repo.
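Stanford Alpaca distributes its data as a JSON list of instruction/input/output records. A hypothetical preprocessing step (the prompt template and file names below are my own, not from either repo) that flattens those records into plain-text training documents might look like:

```python
import json

# Hypothetical prompt template for one Alpaca record.
def format_record(rec):
    if rec.get("input"):
        return (f"Instruction: {rec['instruction']}\n"
                f"Input: {rec['input']}\n"
                f"Response: {rec['output']}")
    return f"Instruction: {rec['instruction']}\nResponse: {rec['output']}"

with open("alpaca_data.json") as f:  # file name from the Stanford Alpaca repo
    records = json.load(f)

# One document per record; tokenization would follow in a real pipeline.
with open("train.txt", "w") as out:
    for rec in records:
        out.write(format_record(rec) + "\n\n")
```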
- Boss wants me to use ChatGPT for work, but I refuse to input my personal phone number. Any advice?
- Let's build GPT: from scratch, in code, spelled out by Andrej Karpathy
You can skip to step 4 using something like GPT-J as far as I understand: https://github.com/kingoflolz/mesh-transformer-jax#links
The pretrained model is already available.
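As a minimal sketch: assuming the Hugging Face mirror of the weights, loading the pretrained model and sampling from it takes only a few lines:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Download the pretrained GPT-J checkpoint from the Hugging Face hub.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-j-6B")

inputs = tokenizer("GPT-J is an open-source", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```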
- Best coding model?
The GitHub repo suggests you may be able to change the number of checkpoints to make it run on a GPU.
- Ask HN: What language models can I fine-tune at home?
- Self-hosted/open-source ChatGPT alternative?
GPT-J, which uses mesh-transformer-jax: https://github.com/kingoflolz/mesh-transformer-jax
- GPT-J, an open-source alternative to GPT-3
They hinted at it in the screenshot, but the goods are linked from the https://6b.eleuther.ai page: https://github.com/kingoflolz/mesh-transformer-jax#gpt-j-6b (Apache 2)
dalle-mini
- Mini-Gemini: Mining the Potential of Multi-Modality Vision Language Models
Mini-Gemini is a bit of a confusing name.
Reminds me of how DALL·E Mini came out three years ago and eventually had to rename itself to Craiyon: https://github.com/borisdayma/dalle-mini
- New baby kitten, what should I name her?
I wouldn't consider Craiyon to be high-tier equipment.
- Annual meatball harvest in southern Italy. Mamma mia. 👌🤌
Made with: https://www.craiyon.com/
- Taylor Swift holding up a novel and reading it aloud in a beautiful library while standing behind a lectern #craiyon
- AI Eevee
AI Site
- AI art: The Thing
- Never underestimate a droid: robots gather at AI for Good summit in Geneva
Try it at https://www.craiyon.com/; it's not even that good a text-to-image generator.
- So simple, yet I can't get the prompt. Any idea?
For example, I used just your prompt on Craiyon; I guess you can also try the free Stable Diffusion demos on Replicate.
- I asked a generic AI to make a picture of Fernando Diniz; these were the results
- MADNESS AI ART
What are some alternatives?
DeepSpeed - DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
DALLE2-pytorch - Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in Pytorch
tensorflow - An Open Source Machine Learning Framework for Everyone
dalle-2-preview
gpt-neo - An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library.
latent-diffusion - High-Resolution Image Synthesis with Latent Diffusion Models
jax - Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more (see the short sketch after this list)
stable-diffusion - A latent text-to-image diffusion model
KoboldAI-Client
dalle-flow - 🌊 A Human-in-the-Loop workflow for creating HD images from text
alpaca-lora - Instruct-tune LLaMA on consumer hardware
stylegan2-pytorch - Simplest working implementation of Stylegan2, state of the art generative adversarial network, in Pytorch. Enabling everyone to experience disentanglement
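To make the jax entry above concrete, here is a minimal sketch of those three composable transformations (the toy function is mine; the jax calls are the library's core API):

```python
import jax
import jax.numpy as jnp

# A toy scalar function to transform.
def f(x):
    return jnp.sin(x) * x

df = jax.grad(f)      # differentiate
vdf = jax.vmap(df)    # vectorize over a batch
fast = jax.jit(vdf)   # JIT-compile for CPU/GPU/TPU

print(fast(jnp.linspace(0.0, 1.0, 4)))  # d/dx [sin(x)*x] at 4 points
```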