-
llama-dl
Discontinued High-speed download of LLaMA, Facebook's 65B parameter GPT model [UnavailableForLegalReasons - Repository access blocked]
-
InfluxDB
Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
"We unfortunately aren't getting any additional gains from lazy page loading, since this is a dense model. To generate a single token, every single page in the model file needs to be loaded. What this means is that first runs that load from spinning disk are still going to be slow, even though the average case has greatly improved"
https://github.com/ggerganov/llama.cpp/issues/91#issuecommen...
Try this: it installs with a simple python command if you have an NVIDIA GPU: https://github.com/kuleshov/minillm