YaLM-100B vs DALLE-mtf

| | YaLM-100B | DALLE-mtf |
| --- | --- | --- |
| Mentions | 35 | 41 |
| Stars | 3,722 | 436 |
| Growth | 0.1% | 0.2% |
| Activity | 0.0 | 0.0 |
| Last commit | 10 months ago | about 2 years ago |
| Language | Python | Python |
| License | Apache License 2.0 | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
YaLM-100B
-
Elon Musk's Grok Exactly Echoes ChatGPT Responses: Identical Answers Raise Questions - EconoTimes
It's probably just open-source software/training sets repurposed... https://github.com/yandex/YaLM-100B
- OpenAI CEO suggests international agency like UN's nuclear watchdog could oversee AI
-
A few less Googleable questions about local LLMs
There is a 100B model published under the Apache 2.0 license. Though there is no information about fine-tuning it or using it in 4-bit with something like llama.cpp. Trying to figure out how to try it without renting an extremely expensive GPU set. https://github.com/yandex/YaLM-100B
-
Is it possible to use llama.cpp or create an Alpaca LoRA for the YaLM-100B model?
Hey everyone! I just discovered an open-source 100 billion parameter language model called YaLM, which is published under the Apache 2.0 license. The model is trained on more than 1 TB of Russian and English text. Here's the GitHub repo: https://github.com/yandex/YaLM-100B and an article explaining how it was trained: https://medium.com/yandex/yandex-publishes-yalm-100b-its-the-largest-gpt-like-neural-network-in-open-source-d1df53d0e9a6
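For context on what the 4-bit idea floated above would look like in practice, here is a minimal, hypothetical sketch using Hugging Face transformers with bitsandbytes quantization. It assumes a converted, HF-compatible checkpoint (YaLM-100B is published as Megatron-style checkpoints, and the model path below is a placeholder, not a real repo); note that even in 4-bit, 100B parameters are roughly 50 GB of weights, so consumer hardware remains a stretch.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Hypothetical sketch only: assumes someone has converted the Megatron-style
# YaLM-100B checkpoint to a Hugging Face format. "path/to/yalm-100b-hf" is a
# placeholder. Even quantized to 4-bit, the weights are ~50 GB.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)
tokenizer = AutoTokenizer.from_pretrained("path/to/yalm-100b-hf")
model = AutoModelForCausalLM.from_pretrained(
    "path/to/yalm-100b-hf",
    quantization_config=quant_config,
    device_map="auto",  # shard layers across available GPUs, spill to CPU if needed
)

inputs = tokenizer("Hello, world!", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=32)[0]))
```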
-
Kandinsky 2.1 - a new open source text-to-Image model
Yandex has already released an LLM: https://github.com/yandex/YaLM-100B
-
Just another casualty...
So there is this open project, YaLM-100B. It requires 200 GB of disk space and is trained on 1.7 TB of text.
- There's a lot of news about American/European AI. Do we know anything about what China, India, Russia and other countries are up to?
-
Suggestion. Chat mode.
You'd think so, but training a model like the one CAI uses would require a truly jaw-dropping amount of funds. That's why CAI is so suspicious, tbh. Just to give you an example, YaLM (100 billion parameters, which is probably less than CAI) took 65 days and 800 A100 graphics cards to train. 175 billion parameters would not cost 1.75 times as much, because it's not a linear function. It would probably be 10x or even more. IIRC, "Open"AI could only afford to train GPT-3 a single time...
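For a rough sense of why the cost grows superlinearly, here is a back-of-the-envelope sketch. It assumes training FLOPs scale as 6 · parameters · tokens and that the token budget grows roughly with model size (a common heuristic); the YaLM figures are the ones quoted above, and this is an illustration, not a real cost model.

```python
# Back-of-the-envelope training-cost scaling; a sketch, not a real cost model.
# Assumes FLOPs ~ 6 * params * tokens, with tokens growing roughly in
# proportion to model size, so compute grows ~quadratically in parameters.
yalm_params = 100e9
yalm_gpu_days = 800 * 65          # 800 A100s for 65 days (figures quoted above)

target_params = 175e9
ratio = target_params / yalm_params
est_gpu_days = yalm_gpu_days * ratio**2   # quadratic heuristic

print(f"YaLM-100B: {yalm_gpu_days:,.0f} A100-days")
print(f"175B estimate: {est_gpu_days:,.0f} A100-days ({ratio**2:.1f}x)")
```

The quadratic heuristic alone gives about 3x; multipliers closer to the 10x quoted above would follow if sequence length, data, or engineering overheads also grow with scale.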
-
Ask HN: Can I download GPT / ChatGPT to my desktop?
I don't much follow AI news beyond what I randomly happen to see on HN, but this might still be the largest open source model: https://github.com/yandex/YaLM-100B . There's discussion of it here: https://old.reddit.com/r/MachineLearning/comments/vpn0r1/d_h... - at the bottom of that page is a comment from someone who actually ran it in the cloud.
-
[Rant] Siri is beyond horrendous and it’s even worse than ever
Hilariously, Yandex Alisa runs circles around it, because it's not just a collection of gimmicks but has an actual 100B-class language model (YaLM, open-sourced) as its core, plus lots of decent engineering. It's helpful, skillful, and feels alive, almost like ChatGPT.
DALLE-mtf
-
How Open is Generative AI? Part 2
This vision is in line with EleutherAI, a non-profit organization founded in July 2020 by a group of researchers. Driven by the perceived opacity and the challenge of reproducibility in AI, their goal was to create leading open-source language models.
- The open source learning curve for AI researchers
- EleutherAI: Empowering Open-Source Artificial Intelligence Research
-
Seeking advice on fine-tuning Pythia for semantic search in a non-English language
My current idea is to utilize EleutherAI's Pythia (the base model behind Databricks Dolly). I would like to know whether translating the Dolly-15k dataset into the desired language using a state-of-the-art translation service like DeepL would be a viable approach to fine-tune the Pythia base model. I want to use this model for semantic search, so perfection is not a necessity.
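A minimal sketch of the pipeline proposed above, assuming a translate() helper backed by DeepL or another machine-translation service (stubbed out here so the script runs as written):

```python
from datasets import load_dataset

# Sketch: translate databricks-dolly-15k, then use the result for instruction
# tuning a Pythia base model. translate() is a placeholder stub; in practice
# it would call DeepL or another MT API.
def translate(text: str, target_lang: str = "xx") -> str:
    # placeholder: call a machine-translation API here
    return text

dolly = load_dataset("databricks/databricks-dolly-15k", split="train")

translated = dolly.map(lambda row: {
    "instruction": translate(row["instruction"]),
    "context": translate(row["context"]),
    "response": translate(row["response"]),
})
translated.to_json("dolly-15k-translated.jsonl")  # feed this to instruction tuning
```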
-
Does anyone want to collaborate to make anti-capitalist AI?
There are open source AI efforts, like EleutherAI. Needless to say, they are lagging behind big players, but it's better than nothing.
-
ChatGPT is bonkers.
The new GPT-3.5 isn't aware of what GPT-3.5 or davinci-002 are (repeatable), and it claimed that it was designed by EleutherAI and has only 6 billion parameters (I wasn't able to repeat that, but didn't really try).
-
My teacher has falsely accused me of using ChatGPT to write an assignment.
Hi, my name is Stella Biderman and I run EleutherAI, one of the foremost non-profit research institutes in the world that train and study large language models. I have been involved with the majority of models to hold the title "largest open source GPT model in the world" and have dabbled in using plagiarism-detection tools to identify code written by GPT-J.
-
dolly-v2-12b
dolly-v2-12b is a 12-billion-parameter causal language model created by Databricks. It is derived from EleutherAI's Pythia-12b and fine-tuned on a ~15K-record instruction corpus generated by Databricks employees, released under a permissive license (CC-BY-SA).
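A short usage sketch, following the pattern on the Hugging Face model card for databricks/dolly-v2-12b (trust_remote_code is needed because the repo ships a custom instruction-following pipeline class; the bfloat16 weights alone take roughly 24 GB of GPU memory):

```python
import torch
from transformers import pipeline

# Sketch based on the databricks/dolly-v2-12b model card.
generate_text = pipeline(
    model="databricks/dolly-v2-12b",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,  # the repo ships a custom instruction pipeline
    device_map="auto",
)
print(generate_text("Explain the difference between nuclear fission and fusion."))
```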
-
Futurism: "The Company Behind Stable Diffusion Appears to Be At Risk of Going Under"
It is true that Emad needs to find an appropriate business model. The good news is that the hype is still ongoing. I'm sure that Emad can grab another round of liquidity injection. He has plenty of resources. Remember, he is also from the finance industry. He has https://www.eleuther.ai/ which can supply a secure, in-house custom LLM equivalent to BloombergGPT.
-
How can AI be used to protect against exploitative use of other AI?
By promoting fully open-source AI, i.e. making datasets, models, methodology and codebases freely available and transparent. What OpenAI claimed to be aiming for, basically.
What are some alternatives?
gpt-neox - An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
VQGAN-CLIP - Just playing with getting VQGAN+CLIP running locally, rather than having to use colab.
SLIDE
CLIP-Guided-Diffusion - Just playing with getting CLIP Guided Diffusion running locally, rather than having to use colab.
NeMo - A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
dalle-mini - DALL·E Mini - Generate images from a text prompt
mesh-transformer-jax - Model parallel transformers in JAX and Haiku
big-sleep - A simple command line tool for text to image generation, using OpenAI's CLIP and a BigGAN. Technique was originally created by https://twitter.com/advadnoun
YaLM-100B - Pretrained language model with 100B parameters
gpt-3 - GPT-3: Language Models are Few-Shot Learners
ClickHouse - ClickHouse® is a free analytics DBMS for big data
DALLE-pytorch - Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch