Run Mistral 7B on M1 Mac

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • llamafile

    Distribute and run LLMs with a single file.
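
    A llamafile bundles the model weights with a llama.cpp server in a single executable; once a Mistral 7B llamafile is started, it serves llama.cpp's HTTP API on localhost:8080 by default. A minimal Python sketch of querying it (the filename in the comment is hypothetical; check the port and payload fields against your llamafile version):

        import json
        import urllib.request

        # Assumes a Mistral 7B llamafile is already running locally, e.g.
        #   ./mistral-7b-instruct.llamafile   (hypothetical filename)
        # By default it exposes llama.cpp's HTTP API on port 8080.
        payload = {
            "prompt": "Explain sliding window attention in one sentence.",
            "n_predict": 128,     # cap on generated tokens
            "temperature": 0.7,
        }
        req = urllib.request.Request(
            "http://localhost:8080/completion",
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            print(json.loads(resp.read())["content"])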

  • Cgml

    GPU-targeted vendor-agnostic AI library for Windows, and Mistral model implementation.

  • Windows equivalent: https://github.com/Const-me/Cgml/tree/master/Mistral/Mistral...

    Runs on GPUs and uses about 5 GB of VRAM. Integrated GPUs generate 1-2 tokens/second; discrete ones often exceed 20 tokens/second.
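
    Those numbers line up with single-batch token generation being memory-bandwidth bound: every new token requires streaming the full ~5 GB of weights through the GPU. A back-of-envelope sketch in Python (the bandwidth figures are illustrative assumptions, not measurements of any particular GPU):

        # Rough ceiling: tokens/s <= memory bandwidth / bytes read per token.
        # Single-batch decoding streams all ~5 GB of weights once per token.
        # Bandwidth numbers below are illustrative assumptions only.
        WEIGHTS_GB = 5.0

        for name, bandwidth_gb_s in [("integrated GPU", 30.0), ("discrete GPU", 400.0)]:
            ceiling = bandwidth_gb_s / WEIGHTS_GB
            print(f"{name}: at most ~{ceiling:.0f} tokens/s")

        # Real throughput sits well below the ceiling (KV-cache reads, kernel
        # launch overhead), consistent with the 1-2 and 20+ tokens/s above.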

  • ollama-webui

    Discontinued ChatGPT-Style WebUI for LLMs (Formerly Ollama WebUI) [Moved to: https://github.com/open-webui/open-webui]

  • ollama

    Get up and running with Llama 3, Mistral, Gemma, and other large language models.

  • https://ollama.ai/

    Very surprised no one else has said it.

    If you prefer a web UI, see ollama-webui (listed above).
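
    Once the model is pulled (ollama pull mistral), the local server listens on port 11434 and can be scripted against its REST API. A minimal Python sketch (the prompt and model tag are just examples):

        import json
        import urllib.request

        # Assumes the ollama server is running and `ollama pull mistral`
        # has been done; /api/generate is ollama's documented endpoint.
        payload = {
            "model": "mistral",
            "prompt": "Why does Mistral 7B run well on an M1 Mac?",
            "stream": False,  # one JSON object instead of a token stream
        }
        req = urllib.request.Request(
            "http://localhost:11434/api/generate",
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            print(json.loads(resp.read())["response"])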

  • llama.cpp

    LLM inference in C/C++

    One thing worth mentioning about llama.cpp wrappers like ollama, LM Studio, and Faraday is that they don't yet support[1] sliding window attention, and instead use the vanilla causal attention from Llama 2. As noted in the Mistral 7B paper[2], SWA gives a longer effective attention span than regular causal attention.

    [1]: https://github.com/ggerganov/llama.cpp/issues/3377
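
    The difference between the two is just the attention mask: vanilla causal attention lets token i attend to every position up to i, while SWA restricts it to the last W positions (W = 4096 in Mistral 7B) and relies on stacked layers to propagate information further back. A toy NumPy sketch of the two masks (sizes are illustrative):

        import numpy as np

        def causal_mask(n: int) -> np.ndarray:
            # Token i may attend to every position j <= i.
            return np.tril(np.ones((n, n), dtype=bool))

        def sliding_window_mask(n: int, window: int) -> np.ndarray:
            # Token i may attend only to j with i - window < j <= i, so a
            # layer costs O(n * window) rather than O(n^2) in attention.
            i = np.arange(n)[:, None]
            j = np.arange(n)[None, :]
            return (j <= i) & (j > i - window)

        print(causal_mask(6).astype(int))
        print(sliding_window_mask(6, 3).astype(int))  # Mistral uses window=4096
        # With k stacked layers, information can still flow back roughly
        # k * window tokens, which is how SWA keeps a long effective span.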

  • OmniQuant

    [ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.

    Not on iOS. On macOS, I personally think WizardLM 13B v1.2 is a very strong model, and I keep hearing good things about it from users on our Discord and in support emails. Now that there's OmniQuant support for Mixtral models[1], I plan to add support for Mixtral-8x7B-Instruct-v0.1 in the next version of the macOS app; in my tests it looks like a very good all-purpose model that's also pretty good at coding. It's pretty memory hungry (~41GB of RAM), but that's the price of an uncompromising implementation. Existing quantized implementations quantize the MoE gates, leading to a significant increase in perplexity compared with fp16 inference.

    [1]: https://github.com/OpenGVLab/OmniQuant/commit/798467
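
    The gate (router) in a Mixtral-style MoE layer is a tiny linear projection whose top-2 logits decide which experts run, so rounding those few weights can flip the expert selection outright, a discrete error that costs far more than the gate's parameter count suggests. A toy sketch of that failure mode (shapes, seed, and the crude round-to-nearest scheme are all illustrative assumptions; OmniQuant itself is far more careful):

        import numpy as np

        rng = np.random.default_rng(0)
        d_model, n_experts = 16, 8   # toy sizes; Mixtral uses 4096 and 8

        x = rng.standard_normal(d_model).astype(np.float32)
        gate = rng.standard_normal((n_experts, d_model)).astype(np.float32)

        def quantize(w: np.ndarray, bits: int = 3) -> np.ndarray:
            # Crude round-to-nearest quantization, only to exaggerate the
            # effect; real schemes are much better, but gates stay fragile.
            scale = np.abs(w).max() / (2 ** (bits - 1) - 1)
            return np.round(w / scale) * scale

        top2 = lambda logits: set(np.argsort(logits)[-2:])
        print("fp16 experts:     ", top2(gate @ x))
        print("quantized experts:", top2(quantize(gate) @ x))
        # Whenever the selected expert set changes, the layer output comes
        # from entirely different weights, which shows up as a jump in
        # perplexity rather than a small rounding error.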

Related posts

  • OpenAI's New Strategy

    2 projects | /r/ChatGPTPro | 9 Dec 2023
  • Ollama is INSANE - Install custom GPTs within seconds! [Video Tutorial]

    1 project | /r/chatgpt_newtech | 16 Nov 2023
  • Ollama-Webui: ChatGPT-Style Responsive Chat Web UI Client (GUI) for Ollama

    1 project | news.ycombinator.com | 11 Nov 2023
  • Run Large and Small Language Models locally with ollama

    2 projects | dev.to | 7 May 2024
  • Run copilot locally

    3 projects | dev.to | 15 Apr 2024