-
> To maintain integrity and prevent misuse, we are releasing our model under a noncommercial license focused on research use cases. Access to the model will be granted on a case-by-case basis to academic researchers; those affiliated with organizations in government, civil society, and academia; and industry research laboratories around the world. People interested in applying for access can find the link to the application in our research paper.
The closest you are going to get to the source is here: https://github.com/facebookresearch/llama
It is still unclear if you're even going to get access to the entire model. Even if you did, you couldn't use it for your commercial product anyway.
-
-
As a simplifying assumption, I'm going to take it that you already know how to stand up and manage a distributed training cluster. Note this is an aggressive assumption.
You would need to replicate the preprocessing steps, which is going to be tricky as they are not described in detail. Then you would need to implement the model using xformers [1], which is going to save you a lot of compute spend. You will also need to manually implement the backward pass to reduce recomputation of expensive activations.
The model was trained on 2,048 A100 GPUs with 80 GB of VRAM each. A single 8× A100 machine from Lambda Cloud costs $12.00/hr [2]. The team from Meta used 256 such machines, giving a per-day cost of $73,728, and training takes 21 days. The upfront lower-bound cost estimate is therefore (12.00 × 24) × 21 × 256 = $1,548,288, assuming everything goes smoothly and your model doesn't bite it during training. You may be able to negotiate bulk pricing for these types of workloads.
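As a quick sanity check on that arithmetic (all figures are the ones quoted above; the hourly rate and machine count come from the Lambda Cloud pricing cited in this comment):

```python
# Back-of-the-envelope LLaMA training cost, using the figures quoted above.
HOURLY_RATE_PER_MACHINE = 12.00  # 8x A100 (80 GB) machine, $/hr, Lambda Cloud figure
NUM_MACHINES = 256               # 256 machines x 8 GPUs = 2048 A100s
TRAINING_DAYS = 21

cost_per_day = HOURLY_RATE_PER_MACHINE * 24 * NUM_MACHINES
total_cost = cost_per_day * TRAINING_DAYS

print(f"Per-day cost: ${cost_per_day:,.0f}")  # $73,728
print(f"Total cost:   ${total_cost:,.0f}")    # $1,548,288
```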
That dollar value is for the compute resources alone. Given the compute costs involved, you will probably also want a team of ML Ops engineers to monitor the training cluster and research scientists to help with the preprocessing and model pipelines.
[1] https://github.com/facebookresearch/xformers
-
gpt_index
Discontinued LlamaIndex (GPT Index) is a project that provides a central interface to connect your LLMs with external data. [Moved to: https://github.com/jerryjliu/llama_index]
(creator of gpt index / llamaindex here https://github.com/jerryjliu/gpt_index)
Funny that we had just rebranded our tool from GPT Index to LlamaIndex about a week ago to avoid potential trademark issues with OpenAI, and turns out Meta has similar ideas around LLM+llama puns :). Must mean the name is good though!
Also very excited to try plugging the LLaMA model into LlamaIndex; will report the results.
-
You mean this code?
https://archive.softwareheritage.org/browse/content/sha1_git...
Do you see that notice at the top of the file? It says:
==
This file is part of Quake III Arena source code.
Quake III Arena source code is free software; you can redistribute it
-
If you're patient, https://github.com/FMInference/FlexGen lets you trade off GPU RAM for system RAM or even disk space.
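For reference, a FlexGen invocation along these lines looks roughly like the following (the `--percent` flag takes six numbers splitting weights, KV cache, and activations between GPU and CPU; treat this as a sketch, as exact flags may vary by version):

```shell
# Run OPT-30B with weights fully offloaded to CPU RAM, KV cache and
# activations kept on the GPU.
# --percent order: weight-GPU% weight-CPU% cache-GPU% cache-CPU% act-GPU% act-CPU%
python -m flexgen.flex_opt --model facebook/opt-30b --percent 0 100 100 0 100 0
```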
-