[P] LoRA adapter switching at runtime to enable Base model to inherit multiple personalities

This page summarizes the projects mentioned and recommended in the original post on /r/MachineLearning

  • fastLLaMa

    fastLLaMa: An experimental high-performance framework for running Decoder-only LLMs with 4-bit quantization in Python using a C/C++ backend.

  • Why this repo and how are we different from other wrappers? Someone asked this in the other post, so I thought I would address it here as well. I am really excited to see so many people building on top of llama.cpp, and I think it deserves all the credit it is getting. It's inspiring to see it shaping up into a mature framework. However, we decided not to simply rebuild the same features in Python, but instead to focus on features that tackle problems I personally face at my day job, where I run mid- to large-sized models in production. Many of these features might or might not make sense for the main repo, but we are always looking for features that we can implement in the main repo, as that benefits the community as a whole. Here is a more detailed answer if anyone is interested.

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
