AI21 Labs Unveils Jamba: The First Production-Grade Mamba-Based AI Model

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • MemGPT

    Building persistent LLM agents with long-term memory 📚🦙

  • On a side note: working over longer contexts also reminds me of MemGPT (https://github.com/cpacker/MemGPT)

  • llama.cpp

    LLM inference in C/C++

  • llama.cpp probably won't be getting Jamba support anytime soon: https://github.com/ggerganov/llama.cpp/issues/6372#issuecomm...

    There is an MLX Mamba implementation, but nothing for Jamba either: https://github.com/alxndrTL/mamba.py/tree/main/mlx

    You could run it in PyTorch on CPU, and with only 12B parameters active per forward pass it might even run relatively fast (8 tok/s?). A q4 quant would also easily fit on 2x3090s and should run at >60 tok/s.
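The "fits on 2x3090s" estimate can be checked with rough arithmetic. The sketch below is a back-of-the-envelope calculation, not a measurement. It assumes Jamba's published total of 52B parameters (a figure not stated in the comment), exactly 4 bits per weight for a q4 quant, and 24 GB of VRAM per 3090. It counts weights only and ignores quantization overhead, activations, and cache memory, so real usage would be somewhat higher.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: total parameter count as published for Jamba; not from the comment.
TOTAL_PARAMS = 52e9
# From the comment: roughly 12B parameters are active per forward pass.
ACTIVE_PARAMS = 12e9

GPU_VRAM_GB = 24  # one RTX 3090
N_GPUS = 2

# All experts must be resident in memory, so sizing uses the total count,
# even though only the active parameters are used for each token.
q4_gb = weight_memory_gb(TOTAL_PARAMS, 4)
bf16_gb = weight_memory_gb(TOTAL_PARAMS, 16)
fits = q4_gb < GPU_VRAM_GB * N_GPUS

print(f"q4 weights:   ~{q4_gb:.0f} GB")
print(f"bf16 weights: ~{bf16_gb:.0f} GB")
print(f"fits in {N_GPUS} x {GPU_VRAM_GB} GB: {fits}")
```

Under these assumptions the q4 weights come to about 26 GB against 48 GB of combined VRAM, which leaves headroom for overhead, while the unquantized 16-bit weights (about 104 GB) would not fit.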

  • mamba.py

    A simple and efficient Mamba implementation in PyTorch and MLX.

