How big of a jump is 13B Vicuna Uncensored vs 30B Vicuna Uncensored?

This page summarizes the projects mentioned and recommended in the original post on /r/LocalLLaMA

  • AGiXT

    AGiXT is a dynamic AI Agent Automation Platform that seamlessly orchestrates instruction management and complex task execution across diverse AI providers. Combining adaptive memory, smart features, and a versatile plugin system, AGiXT delivers efficient and comprehensive AI solutions.

  • File upload and automatic agents exist; they are just buggy. The developers are building it at an insane pace, so it is practically broken 90% of the time. Maybe it's working better right now. I had success with v1.1.31 as well. https://github.com/Josh-xt/AGiXT

  • mpt-lora-patch

    Patch for MPT-7B which allows using and training a LoRA

  • To merge a LoRA into an existing model, use this script:
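The merge script itself isn't quoted in the post, but numerically a LoRA merge is just folding the low-rank update into the frozen base weight: W' = W + (alpha/r)·B·A. A minimal numpy sketch of that idea (all names and sizes here are illustrative, not from the actual script):

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 8, 8, 2, 16   # toy dimensions; r is the LoRA rank

W = rng.normal(size=(d_out, d_in))    # frozen base weight
A = rng.normal(size=(r, d_in))        # LoRA down-projection
B = rng.normal(size=(d_out, r))       # LoRA up-projection

# Merging folds the scaled low-rank update into the base weight,
# so inference afterwards needs no adapter code at all.
W_merged = W + (alpha / r) * (B @ A)

x = rng.normal(size=d_in)
# The merged weight reproduces base output + adapter output exactly.
assert np.allclose(W_merged @ x, W @ x + (alpha / r) * (B @ (A @ x)))
```

After merging, the result is saved as an ordinary checkpoint, which is why it can then be quantized like any other model.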

  • GPTQ-for-LLaMa

    4 bits quantization of LLaMA using GPTQ

  • Then once you have done that, re-quantize the model with GPTQ-for-LLaMa. Many models, including LLaMA, are compatible with the regular Triton version; if not, you may have to find a fork that is compatible.
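For intuition about what 4-bit quantization does to the merged weights, here is a simple round-to-nearest, per-group sketch in numpy. To be clear, GPTQ proper additionally does Hessian-aware error correction as it quantizes; this only illustrates the 4-bit storage format side (16 levels per group, one scale and zero-point each):

```python
import numpy as np

def quantize_4bit(W, group_size=4):
    """Round-to-nearest 4-bit quantization with one scale/offset per
    column group. (GPTQ itself also corrects accumulated error using
    second-order information; this shows only the format.)"""
    out = np.empty_like(W, dtype=np.float64)
    for i in range(0, W.shape[1], group_size):
        g = W[:, i:i + group_size]
        lo, hi = g.min(), g.max()
        scale = (hi - lo) / 15 or 1.0                    # 16 levels = 4 bits
        q = np.clip(np.round((g - lo) / scale), 0, 15)   # integer codes 0..15
        out[:, i:i + group_size] = q * scale + lo        # dequantized values
    return out

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))
W_hat = quantize_4bit(W)
# Per-group error is at most half a quantization step.
assert np.abs(W - W_hat).max() <= (W.max() - W.min()) / 15
```

Smaller group sizes give better reconstruction at the cost of storing more scales, which is the same trade-off the real tool's group-size option controls.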

  • text-generation-webui

    A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.

  • If you are using the Triton version or my CUDA fork for inference, you can use act-order.
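Act-order (also called desc_act) means quantizing weight columns in order of decreasing activation significance, so the most influential columns are handled before rounding error accumulates; the permutation is then inverted so the stored matrix still lines up with its inputs. A hedged numpy sketch of just the ordering step (the importance proxy and sizes here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
# Calibration activations with deliberately uneven per-column magnitudes.
X = rng.normal(size=(64, 8)) * np.array([5, 1, 3, 0.5, 2, 4, 0.1, 1.5])
W = rng.normal(size=(8, 8))

# Proxy for column importance: the diagonal of X^T X / n, i.e. the mean
# squared activation each input column feeds into the weight matrix.
importance = (X ** 2).mean(axis=0)
order = np.argsort(-importance)      # most active input columns first

# Act-order: quantize columns of W in this order, then undo the
# permutation so the final matrix matches the original input layout.
W_perm = W[:, order]
inverse = np.argsort(order)
assert np.allclose(W_perm[:, inverse], W)
```

Kernels that support act-order must apply (or pre-fold) this permutation at inference time, which is why only some backends, like the Triton version, can run such models.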
