-
If a variable contains the batch size, then name it accordingly: batch_size.
And no glossary is needed, KISS.
https://github.com/johnma2006/mamba-minimal/blob/82efa90919c...
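For instance (a hypothetical snippet, not taken from mamba-minimal, just illustrating the naming point):

    import torch

    x = torch.randn(4, 128, 16)  # hypothetical input tensor

    # Opaque: the reader has to decode single letters.
    b, l, d = x.shape

    # Self-documenting: no glossary required.
    batch_size, seq_len, d_model = x.shape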
-
The original Mamba code has a lot of speed optimizations and other machinery that make it hard to grasp at first, so a minimal version like this helps with learning.
I can't help but also plug my own Mamba inference implementation. https://github.com/rbitr/llm.f90/tree/master/ssm
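For reference, the core of what these minimal implementations compute is just a per-timestep recurrence, something like this stripped-down sketch (names and shapes are hypothetical; real implementations carry per-channel state and discretization):

    import torch

    def sequential_ssm_scan(A_bar, Bx, C):
        # Naive selective scan: h_t = A_bar_t * h_{t-1} + (B_bar x)_t ; y_t = <C_t, h_t>
        # Hypothetical shapes: all three arguments are (batch_size, seq_len, d_state).
        batch_size, seq_len, d_state = A_bar.shape
        h = torch.zeros(batch_size, d_state)
        ys = []
        for t in range(seq_len):
            h = A_bar[:, t] * h + Bx[:, t]        # elementwise state update
            ys.append((C[:, t] * h).sum(dim=-1))  # project state to output
        return torch.stack(ys, dim=1)             # (batch_size, seq_len)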
-
heinsen_sequence
Code implementing "Efficient Parallelization of a Ubiquitous Sequential Computation" (Heinsen, 2023)
with only two calls to the PyTorch API. See the examples here:
https://github.com/glassroom/heinsen_sequence/blob/main/README.md
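The idea, roughly: the recurrence x_t = a_t * x_{t-1} + b_t becomes two prefix scans in log space, one cumsum and one logcumsumexp. A minimal sketch, assuming positive coefficients and x_0 = 0 (function and variable names are mine, not the repo's):

    import torch

    def heinsen_scan(log_a, log_b):
        # Computes x_t = a_t * x_{t-1} + b_t for positive a_t, b_t given their logs,
        # using only torch.cumsum and torch.logcumsumexp.
        a_star = torch.cumsum(log_a, dim=-1)  # log of prefix products of a
        log_x = a_star + torch.logcumsumexp(log_b - a_star, dim=-1)
        return torch.exp(log_x)

    # Check against the sequential loop:
    a, b = torch.rand(8) + 0.5, torch.rand(8) + 0.5
    x, xs = 0.0, []
    for t in range(8):
        x = a[t] * x + b[t]
        xs.append(x)
    print(torch.allclose(heinsen_scan(a.log(), b.log()), torch.stack(xs)))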
-
>"everyone" seems to know Mamba. I never heard of Mamba
Only the "everybody who knows what mamba is" are the ones upvoting and commenting. Think of all the people who ignore it. For me, Mamba is the faster version of Conda [1], and that's why I clicked on the article.
https://github.com/mamba-org/mamba
-
ai-notes
notes for software engineers getting up to speed on new AI developments. Serves as a datastore for https://latent.space writing and product brainstorming, but has cleaned-up canonical references under the /Resources folder.
The field just moves fast. I have curated a list of non-hypey writers and YouTubers who explain these things for a typical SWE audience, if you are interested. https://github.com/swyxio/ai-notes/blob/main/Resources/Good%...
-
curated-transformers
🤖 A PyTorch library of curated Transformer models and their composable components
https://github.com/explosion/curated-transformers/blob/main/...
Llama 1/2:
https://github.com/explosion/curated-transformers/blob/main/...
MPT:
https://github.com/explosion/curated-transformers/blob/main/...
With various features enabled, including support for TorchScript JIT, PyTorch flash attention, etc.
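"PyTorch flash attention" here refers to torch.nn.functional.scaled_dot_product_attention, which dispatches to a FlashAttention-style fused kernel when one is available. A standalone sketch, independent of curated-transformers (shapes are hypothetical):

    import torch
    import torch.nn.functional as F

    # (batch, heads, seq_len, head_dim) -- hypothetical shapes
    q = torch.randn(2, 8, 128, 64)
    k = torch.randn(2, 8, 128, 64)
    v = torch.randn(2, 8, 128, 64)

    # Fused attention; PyTorch picks the fastest supported backend.
    out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
    print(out.shape)  # torch.Size([2, 8, 128, 64])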