WizardLM-13B-Uncensored

This page summarizes the projects mentioned and recommended in the original post on /r/LocalLLaMA

  • text-generation-webui

    A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.

  • Thank you for bringing that to my attention! I can't (without starving to death) spend more than around 100 until I can afford another real computer. I'll poke around and check out the Docker part anyway, though I'll need to dig in further: https://github.com/oobabooga/text-generation-webui mentions that I should set TORCH_CUDA_ARCH_LIST based on my GPU, and I have no idea what the equivalent is for my poor man's GPU, Intel integrated graphics; see the sketch below.
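
    For anyone with an NVIDIA card who hits the same question, a minimal sketch of how one might derive a TORCH_CUDA_ARCH_LIST value (assumes PyTorch is installed; not taken from the webui docs). On Intel integrated graphics there is no CUDA device, so the variable simply does not apply:

        # Minimal sketch: derive a TORCH_CUDA_ARCH_LIST value for an NVIDIA GPU.
        # Assumes PyTorch is installed. On machines without a CUDA device
        # (e.g. Intel integrated graphics) the variable does not apply.
        import torch

        if torch.cuda.is_available():
            major, minor = torch.cuda.get_device_capability(0)
            print(f'TORCH_CUDA_ARCH_LIST="{major}.{minor}"')
        else:
            print("No CUDA device found; TORCH_CUDA_ARCH_LIST does not apply here.")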

  • mlc-llm

    Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.

  • I run vicuna-7b in the browser on my MacBook Pro M1 via https://github.com/mlc-ai/mlc-llm
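
    The in-browser path runs via MLC's WebLLM build (WebGPU); mlc-llm also ships a native Python wrapper. A minimal sketch, assuming the mlc_chat package is installed and prebuilt quantized weights have been downloaded (the model id below is illustrative, not a verified name):

        # Minimal sketch of mlc-llm's native Python API (the browser demo uses
        # WebLLM/WebGPU instead). Assumes mlc_chat is installed and prebuilt
        # weights exist locally; the model id below is illustrative.
        from mlc_chat import ChatModule

        cm = ChatModule(model="vicuna-v1-7b-q4f16_1")  # illustrative model id
        print(cm.generate(prompt="Explain quantization in one sentence."))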

  • koboldcpp

    A simple one-file way to run various GGML and GGUF models with KoboldAI's UI

  • As far as I know, you only need a single ggml .bin file for CPU inference. I use koboldcpp, and it's just a matter of dragging and dropping the .bin onto the .exe to make it work.
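
    Once koboldcpp is running, it exposes a KoboldAI-compatible HTTP API, so you can also script it instead of using the UI. A minimal sketch, assuming the default port 5001 and the /api/v1/generate endpoint:

        # Minimal sketch: query a running koboldcpp instance over its
        # KoboldAI-compatible HTTP API. Assumes koboldcpp was started with a
        # ggml .bin model and is listening on its default port 5001.
        import json
        import urllib.request

        payload = {"prompt": "Once upon a time", "max_length": 80}
        req = urllib.request.Request(
            "http://localhost:5001/api/v1/generate",
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            print(json.load(resp)["results"][0]["text"])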

  • llama.cpp

    LLM inference in C/C++

  • Of course, the bigger the model, the longer it takes. 7B q5_1 generations take about 400-450 ms/token, 13B q5_1 about 700-800 ms/token. Thanks to a flood of optimizations, things have been improving steadily, and efforts like "Proof of concept: GPU-accelerated token generation" will soon provide another much-needed and welcome boost.
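
    Those per-token latencies convert directly into throughput; a quick back-of-the-envelope calculation using the midpoints of the figures quoted above:

        # Convert the quoted per-token latencies (midpoints) into tokens/second
        # and an estimated wall time for a 200-token generation.
        for model, ms_per_token in [("7B q5_1", 425), ("13B q5_1", 750)]:
            tok_per_s = 1000 / ms_per_token
            secs_200 = 200 * ms_per_token / 1000
            print(f"{model}: ~{tok_per_s:.1f} tokens/s, ~{secs_200:.0f} s per 200 tokens")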

  • WizardVicunaLM

    An LLM that combines the principles of WizardLM and VicunaLM

  • ggml

    Tensor library for machine learning

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.

Related posts

  • AI on an Android phone?

    2 projects | /r/LocalLLaMA | 8 Dec 2023
  • MLC vs llama.cpp

    2 projects | /r/LocalLLaMA | 7 Nov 2023
  • [Project] Scaling LLama2 70B with Multi NVIDIA and AMD GPUs under 3k budget

    1 project | /r/LocalLLaMA | 21 Oct 2023
  • Scaling LLama2-70B with Multi Nvidia/AMD GPU

    2 projects | news.ycombinator.com | 19 Oct 2023
  • ROCm Is AMD's #1 Priority, Executive Says

    5 projects | news.ycombinator.com | 26 Sep 2023