Gemma doesn't suck anymore – 8 bug fixes

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com.

  • transformers

    🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

  • Thanks! :) I'm pushing them into transformers and gemma_pytorch, and collaborating with the Gemma team to resolve all the issues :)

    The RoPE fix should already be in transformers 4.38.2: https://github.com/huggingface/transformers/pull/29285

    My main PR for transformers which fixes most of the issues (some still left): https://github.com/huggingface/transformers/pull/29402
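
As a rough sketch of how one might check whether an installed transformers release is at least 4.38.2 (the version the comment above says should already include the RoPE fix), the helper below compares version strings. The names `parse_version`, `has_rope_fix`, and `MIN_VERSION` are hypothetical and not part of transformers. The check says nothing about the fixes still pending in the second PR.

```python
import re

# Release that, according to the comment above, should already contain the RoPE fix.
# This threshold is taken from the comment, not verified against the changelog.
MIN_VERSION = (4, 38, 2)


def parse_version(version: str) -> tuple:
    """Return the leading (major, minor, patch) numbers of a version string.

    A missing patch component is treated as 0, and suffixes such as
    ".dev0" are ignored.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def has_rope_fix(version: str) -> bool:
    """True if the version is at least MIN_VERSION (hypothetical helper)."""
    return parse_version(version) >= MIN_VERSION
```

With transformers installed, this could be called as `has_rope_fix(transformers.__version__)`. Comparing numeric tuples rather than raw strings avoids ordering mistakes such as "4.9.0" sorting after "4.38.2".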

  • gemma_pytorch

    The official PyTorch implementation of Google's Gemma models

  • Here are the missing links:

    * Gemma, a family of open models from Google: https://ai.google.dev/gemma

    * Unsloth is a tool/method for training models faster (IIUC): https://github.com/unslothai/unsloth

  • unsloth

    Finetune Llama 3, Mistral & Gemma LLMs 2-5x faster with 80% less memory

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
