ai-on-gke | gemma | |
---|---|---|
2 | 3 | |
163 | 2,111 | |
17.2% | 11.5% | |
9.8 | 5.6 | |
2 days ago | 5 days ago | |
Jupyter Notebook | Jupyter Notebook | |
Apache License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
ai-on-gke
-
Gemma: New Open Models
There is a lot of work to make the actual infrastructure and lower level management of lots and lots of GPUs/TPUs open as well - my team focuses on making the infrastructure bit at least a bit more approachable on GKE and Kubernetes.
https://github.com/GoogleCloudPlatform/ai-on-gke/tree/main
and
https://github.com/google/xpk (a bit more focused on HPC, but includes AI)
and
https://github.com/stas00/ml-engineering (not associated with GKE, but describes training with SLURM)
The actual training is still a bit of a small pool of very experienced people, but it's getting better. And every day serving models gets that much faster - you can often simply draft on Triton and TensorRT-LLM or vLLM and see significant wins month to month.
- Reference Architecture for ML Training and Batch on GKE with Kueue
gemma
-
What are LLMs? An intro into AI, models, tokens, parameters, weights, quantization and more
Medium models: Roughly between 1B to 10B parameters. This is where Mistral 7B, Phi-3, Gemma from Google DeepMind, and wizardlm2 sit. Fun fact: GPT 2 was a medium sized model, much smaller than its latest versions.
- Gemma – a family of lightweight, state-of-the art open models from Google
-
Gemma: New Open Models
We've documented the architecture (including key differences) in our technical report here (https://goo.gle/GemmaReport), and you can see the architecture implementation in our Git Repo (https://github.com/google-deepmind/gemma).
What are some alternatives?
gemma_pytorch - The official PyTorch implementation of Google's Gemma models
gemma.cpp - lightweight, standalone C++ inference engine for Google's Gemma models.
ollama - Get up and running with Llama 3, Mistral, Gemma, and other large language models.