Brave Leo now uses Mixtral 8x7B as default

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • llama.cpp

    LLM inference in C/C++

  • Nice graphs here: https://github.com/ggerganov/llama.cpp/pull/1684

    So, for example, the 2-bit version of the 30B model is much worse than the original, but still better than the 13B model.

    Also, there are lots of extra details, e.g., not all of the weights are 2-bit, and even the 2-bit weights take more than 2 bits each overall, because groups of quantised weights share scale factors that are stored separately.
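
    The point about shared scale factors can be made concrete with a little arithmetic. The sketch below is only an illustration: the group sizes and scale widths are assumed values, not the actual layout of any llama.cpp format, and `effective_bits_per_weight` is a hypothetical helper.

```python
def effective_bits_per_weight(weight_bits: int, group_size: int,
                              overhead_bits_per_group: int) -> float:
    """Average storage cost per weight when each group shares some metadata."""
    total_bits = weight_bits * group_size + overhead_bits_per_group
    return total_bits / group_size


# Assumed layout: 2-bit weights, 16 weights per group, one 16-bit scale per group.
print(effective_bits_per_weight(2, 16, 16))   # 3.0

# Assumed layout: 4-bit weights, 32 weights per group, one 16-bit scale per group.
print(effective_bits_per_weight(4, 32, 16))   # 4.5
```

    Larger groups spread the scale over more weights and push the average closer to the nominal bit width, usually at some cost in accuracy.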

  • Cgml

    GPU-targeted vendor-agnostic AI library for Windows, and Mistral model implementation.

  • Here’s an example of a custom 4 bits/weight codec for ML weights:

    https://github.com/Const-me/Cgml/blob/master/Readme.md#bcml1...

    llama.cpp does it slightly differently but still, AFAIK their quantized data formats are conceptually similar to my codec.
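
    As a rough illustration of what a block-based 4-bit codec can look like, here is a minimal sketch. It is not the BCML1 codec and not a llama.cpp format: the block size, the float32 scale, the signed code range, and the names `encode_block` and `decode_block` are all assumptions made for the example.

```python
import struct

BLOCK = 32  # weights per block (assumed value)


def encode_block(weights):
    """Quantise one block to 4-bit codes plus a single float32 scale."""
    assert len(weights) == BLOCK
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    # Map each weight to a signed code in [-8, 7], then shift it to [0, 15].
    codes = [max(-8, min(7, round(w / scale))) + 8 for w in weights]
    # Pack two 4-bit codes into each byte, low nibble first.
    packed = bytes(codes[i] | (codes[i + 1] << 4) for i in range(0, BLOCK, 2))
    return struct.pack("<f", scale) + packed


def decode_block(blob):
    """Recover approximate weights from a block produced by encode_block."""
    (scale,) = struct.unpack_from("<f", blob, 0)
    out = []
    for byte in blob[4:]:
        out.append(((byte & 0x0F) - 8) * scale)
        out.append(((byte >> 4) - 8) * scale)
    return out


weights = [(i - 16) / 16 for i in range(BLOCK)]
blob = encode_block(weights)
print(len(blob) * 8 / BLOCK)  # 5.0 bits per weight: 4 for the code, 1 for the shared scale
```

    In this toy layout the nominal 4 bits per weight become 5 once the per-block scale is counted.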

  • uBlock

    uBlock Origin - An efficient blocker for Chromium and Firefox. Fast and lean.

  • > It allows for 30,000 dynamic rules

    That is not what we mean by dynamic filters. From https://developer.chrome.com/blog/improvements-to-content-fi...

    > However, to support more frequent updates and user-defined rules, extensions can add rules dynamically too, without their developers having to upload a new version of the extension to the Chrome Web Store.

    What Chrome is talking about is the ability to specify rules at runtime. What critics of Manifest V3 are talking about is not the ability to add rules dynamically (although that can be an issue); it is the ability to add dynamic rules, i.e. rules that analyze and rewrite requests in the style of the webRequestBlocking permission.

    It's a little deceptive to claim that the concerns here are outdated and to point to vague terminology that sounds like it's correcting the problem, but on actual inspection turns out to be entirely separate functionality from what the GP was talking about.

    > Giving this ability to extensions can slow down the browser for the user. These ads can still be blocked through other means.

    This is the debate; most of the adblocking community disagrees with this assertion. uBO maintains a list of some common features that are already impossible to support in Chrome ( https://github.com/gorhill/uBlock/wiki/uBlock-Origin-works-b... ) and has written about features that cannot be supported via Chrome's current V3 API ( https://github.com/uBlockOrigin/uBOL-home/wiki/Frequently-as... ). Of particular note are filtering of large media elements (I use this a lot on mobile Firefox; it's great for reducing page size) and top-level filtering of domains/fonts.
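
    The distinction drawn above, between adding rules dynamically and rules that are themselves dynamic, can be sketched with a toy model. Everything here is hypothetical: the `Request` fields, the rule shape, and the function names are illustrative only. They are not the Chrome extension API and do not model what that API can or cannot match.

```python
from dataclasses import dataclass


@dataclass
class Request:
    url: str
    resource_type: str
    size_bytes: int


def add_rule(rules, rule):
    """'Adding rules dynamically': new rule data is registered at runtime."""
    rules.append(rule)


def declarative_decision(rules, req):
    """The host, not the extension, matches each request against the rule data."""
    for rule in rules:
        if rule["url_contains"] in req.url and req.resource_type in rule["resource_types"]:
            return rule["action"]
    return "allow"


def blocking_handler(req):
    """'Dynamic rules': extension code runs per request and decides for itself."""
    if req.resource_type == "media" and req.size_bytes > 1_000_000:
        return "block"
    return "allow"


rules = []
add_rule(rules, {"action": "block", "url_contains": "ads.example",
                 "resource_types": ["script", "image"]})

ad = Request("https://ads.example/banner.js", "script", 20_000)
video = Request("https://video.example/clip.mp4", "media", 5_000_000)

print(declarative_decision(rules, ad))     # block
print(declarative_decision(rules, video))  # allow
print(blocking_handler(video))             # block
```

    In the first style the extension can only supply data in whatever shape the host understands. In the second it can apply arbitrary logic to each request, which is the capability the comment says critics are concerned about.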

  • uBOL-home

    uBO Lite home (MV3)

  • willow-inference-server

    Open source, local, and self-hosted highly optimized language inference server supporting ASR/STT, TTS, and LLM across WebRTC, REST, and WS

  • I think this perspective comes from a lack of historical experience and hands-on experience overall.

    Nvidia more broadly has very impressive support for their GPUs. The support lifecycles for their Jetson hardware have been significantly worse over time. I encourage you to look at what those lifecycles have looked like; the most "egregious" example is the Jetson Nano, for which support was dropped within, from what I recall, a couple of years.

    Another consideration: Jetson is optimized for power efficiency and form factor, and on a per-dollar basis its CUDA performance is terrible. The power efficiency and form factor come at significant cost. See this discussion from one of my projects[0]. I evaluated WIS on an Orin that I have, and from what I can recall it was significantly slower than a GTX 1070, which is... unimpressive.

    In the end, what do I care what people use? I'm offering the perspective and experience of someone who has actually used the Jetson line for many years and frequently struggled with all of these issues and more.

    [0] - https://github.com/toverainc/willow-inference-server/discuss...

  • Not so sure about that. Check out https://github.com/ray-project/llmperf-leaderboard

    And try mixtral on chat.groq.com
