Brave Leo now uses Mixtral 8x7B as default

llama.cpp

769 55,846 10.0 C++

LLM inference in C/C++

Nice graphs here: https://github.com/ggerganov/llama.cpp/pull/1684
So, for example, the 2-bit version of the 30B model is much worse than the original but still better than the 13B model.
Also, there are lots of extra details, e.g. not all of the weights are 2-bit, and even the 2-bit weights take more than 2 bits each overall, because groups of quantised weights share scale factors that are stored elsewhere.
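
To make the point about shared scale factors concrete, here is a minimal sketch in plain Python. It is not llama.cpp's actual format: the block size of 32, the per-block offset, the 32 bits of metadata, and all function names are assumptions chosen for illustration. It shows why "2-bit" weights cost more than 2 bits each once the per-block metadata is counted.

```python
def quantize_block(block, bits=2):
    """Quantize one block of floats to `bits`-bit codes that share a scale and offset."""
    top = (1 << bits) - 1                      # highest code, e.g. 3 for 2 bits
    lo, hi = min(block), max(block)
    scale = (hi - lo) / top if hi > lo else 1.0  # constant block: any scale works
    # Round each weight to the nearest code and clamp into the valid range.
    codes = [min(top, max(0, round((w - lo) / scale))) for w in block]
    return codes, scale, lo


def dequantize_block(codes, scale, lo):
    """Reconstruct approximate weights from the codes and the shared metadata."""
    return [c * scale + lo for c in codes]


def effective_bits_per_weight(bits, block_size, metadata_bits):
    """Storage cost per weight once the per-block metadata is amortised."""
    return bits + metadata_bits / block_size


# Hypothetical block of weights, for illustration only.
block = [0.12, -0.40, 0.33, 0.05]
codes, scale, lo = quantize_block(block)
print(codes, dequantize_block(codes, scale, lo))

# Assumed layout: 32 weights per block, 16-bit scale plus 16-bit offset.
print(effective_bits_per_weight(bits=2, block_size=32, metadata_bits=32))  # 3.0
```

Under these assumed numbers, the nominal 2 bits become 3 bits per weight. Larger blocks amortise the metadata better but force more weights to share one scale, which tends to increase rounding error.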

Cgml

21 37 8.6 C++

GPU-targeted vendor-agnostic AI library for Windows, and Mistral model implementation.

Here’s an example of a custom 4 bits/weight codec for ML weights:
https://github.com/Const-me/Cgml/blob/master/Readme.md#bcml1...
llama.cpp does it slightly differently but still, AFAIK their quantized data formats are conceptually similar to my codec.
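
As a rough illustration of what "4 bits/weight" means at the storage level, the sketch below packs two 4-bit codes into each byte. This is neither the layout from the linked readme nor any llama.cpp format; the nibble order and the function names are assumptions made for the example.

```python
def pack_nibbles(codes):
    """Pack 4-bit codes (ints in 0..15) two per byte, low nibble first."""
    if len(codes) % 2:
        raise ValueError("expected an even number of codes")
    if any(not 0 <= c <= 15 for c in codes):
        raise ValueError("codes must fit in 4 bits")
    # Even-indexed codes go in the low nibble, odd-indexed in the high nibble.
    return bytes(lo | (hi << 4) for lo, hi in zip(codes[0::2], codes[1::2]))


def unpack_nibbles(packed):
    """Inverse of pack_nibbles: split each byte back into two 4-bit codes."""
    codes = []
    for byte in packed:
        codes.append(byte & 0x0F)  # low nibble
        codes.append(byte >> 4)    # high nibble
    return codes


packed = pack_nibbles([1, 2, 15, 0])
print(packed.hex())            # 210f
print(unpack_nibbles(packed))  # [1, 2, 15, 0]
```

A real codec would also store per-block metadata such as a scale factor next to the packed codes, which is what makes the actual cost slightly higher than 4 bits per weight.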

uBlock

2,992 43,007 9.9 JavaScript

uBlock Origin - An efficient blocker for Chromium and Firefox. Fast and lean.

> It allows for 30,000 dynamic rules
That is not what we mean by dynamic filters. From https://developer.chrome.com/blog/improvements-to-content-fi...
> However, to support more frequent updates and user-defined rules, extensions can add rules dynamically too, without their developers having to upload a new version of the extension to the Chrome Web Store.
What Chrome is talking about is the ability to specify rules at runtime. What critics of Manifest V3 are talking about is not the ability to dynamically add rules (although that can be an issue), but the ability to add dynamic rules, i.e. rules that analyze and rewrite requests in the style of the webRequestBlocking permission.
It's a little deceptive to claim that the concerns here are outdated and to point to vague terminology that sounds like it's correcting the problem, but on actual inspection turns out to be entirely separate functionality from what the GP was talking about.
> Giving this ability to extensions can slow down the browser for the user. These ads can still be blocked through other means.
This is the debate; most of the adblocking community disagrees with this assertion. uBO maintains a list of some common features that are already impossible to support in Chrome ( https://github.com/gorhill/uBlock/wiki/uBlock-Origin-works-b... ) and has written about features that cannot be supported via Chrome's current V3 API ( https://github.com/uBlockOrigin/uBOL-home/wiki/Frequently-as... ). Of particular note are filtering of large media elements (I use this a lot on mobile Firefox; it's great for reducing page size) and top-level filtering of domains/fonts.

uBOL-home

16 348 8.4 JavaScript

uBO Lite home (MV3)


willow-inference-server

7 316 8.3 Python

Open source, local, and self-hosted highly optimized language inference server supporting ASR/STT, TTS, and LLM across WebRTC, REST, and WS

I think this perspective comes from a lack of historical experience and hands-on experience overall.
Nvidia more broadly has very impressive support for their GPUs. The support lifecycles for their Jetson hardware have been significantly worse over time. I encourage you to look at what those lifecycles have looked like, with the most "egregious" example being the dropping of support for the Jetson Nano within, from what I recall, a couple of years.
Another consideration: Jetson is optimized for power efficiency and form factor, and on a per-dollar basis its CUDA performance is terrible. The power efficiency and form factor come at significant cost. See this discussion from one of my projects[0]. I evaluated the use of WIS on an Orin that I have and, from what I can recall, it was significantly slower than a GTX 1070, which is... unimpressive.
In the end, what do I care what people use? I'm offering the perspective and experience of someone who has actually used the Jetson line for many years and frequently struggled with all of these issues and more.
[0] - https://github.com/toverainc/willow-inference-server/discuss...

llmperf-leaderboard

6 378 5.6

Not so sure about that. Check out https://github.com/ray-project/llmperf-leaderboard
And try mixtral on chat.groq.com

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Apr 24th is JavaScript Naked Day – Browse the web without JavaScript
1 project | news.ycombinator.com | 21 Apr 2024
Some notes on Firefox's media autoplay settings in practice as of Firefox 124
2 projects | news.ycombinator.com | 30 Mar 2024
X.org Server Clears Out Remnants for Supporting Old Compilers
2 projects | news.ycombinator.com | 21 Feb 2024
Mozilla thinks Apple, Google, Microsoft should play fair
2 projects | news.ycombinator.com | 27 Jan 2024
uBlock Origin – 1.55.0
1 project | news.ycombinator.com | 3 Jan 2024

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com
ublock-origin Cuda Blocker Deep Learning Firefox
Post date: 27 Jan 2024
