Our great sponsors
-
SurveyJS
Open-Source JSON Form Builder to Create Dynamic Forms Right in Your App. With SurveyJS form UI libraries, you can build and style forms in a fully-integrated drag & drop form builder, render them in your JS app, and store form submission data in any backend, inc. PHP, ASP.NET Core, and Node.js.
-
willow-inference-server
Open source, local, and self-hosted highly optimized language inference server supporting ASR/STT, TTS, and LLM across WebRTC, REST, and WS
Nice graphs here: https://github.com/ggerganov/llama.cpp/pull/1684
So for example, 2 bit version of the 30B is much worse than the original, but still better than the 13B model.
Also, there are lots of extra details, eg, not all of the weights are 2 bit, and even the 2 bit weights are higher than that overall as groups of quantised weights share scale factors stored elsewhere.
Here’s an example of a custom 4 bits/weight codec for ML weights:
https://github.com/Const-me/Cgml/blob/master/Readme.md#bcml1...
llama.cpp does it slightly differently but still, AFAIK their quantized data formats are conceptually similar to my codec.
> It allows for 30,000 dynamic rules
That is not what we mean by dynamic filters. From https://developer.chrome.com/blog/improvements-to-content-fi...
> However, to support more frequent updates and user-defined rules, extensions can add rules dynamically too, without their developers having to upload a new version of the extension to the Chrome Web Store.
What Chrome is talking about is the ability to specify rules at runtime. What critics of Manifest V3 are talking about is not the ability to dynamically add rules (although that can be an issue), it is the ability to add dynamic rules -- ie rules that analyze and rewrite requests in the style of the blockingWebRequest permission.
It's a little deceptive to claim that the concerns here are outdated and to point to vague terminology that sounds like it's correcting the problem, but on actual inspection turns out to be entirely separate functionality from what the GP was talking about.
> Giving this ability to extensions can slow down the browser for the user. These ads can still be blocked through other means.
This is the debate; most of the adblocking community disagrees with this assertion. uBO maintains a list of some common features that are already not possible to support in Chrome ( https://github.com/gorhill/uBlock/wiki/uBlock-Origin-works-b... ) and has written about features that are not able to be supported via Chrome's current V3 API ( https://github.com/uBlockOrigin/uBOL-home/wiki/Frequently-as... ). Of particular note are filtering for large media elements (I use this a lot on mobile Firefox, it's great for reducing page size), and top-level filtering of domains/fonts.
> It allows for 30,000 dynamic rules
That is not what we mean by dynamic filters. From https://developer.chrome.com/blog/improvements-to-content-fi...
> However, to support more frequent updates and user-defined rules, extensions can add rules dynamically too, without their developers having to upload a new version of the extension to the Chrome Web Store.
What Chrome is talking about is the ability to specify rules at runtime. What critics of Manifest V3 are talking about is not the ability to dynamically add rules (although that can be an issue), it is the ability to add dynamic rules -- ie rules that analyze and rewrite requests in the style of the blockingWebRequest permission.
It's a little deceptive to claim that the concerns here are outdated and to point to vague terminology that sounds like it's correcting the problem, but on actual inspection turns out to be entirely separate functionality from what the GP was talking about.
> Giving this ability to extensions can slow down the browser for the user. These ads can still be blocked through other means.
This is the debate; most of the adblocking community disagrees with this assertion. uBO maintains a list of some common features that are already not possible to support in Chrome ( https://github.com/gorhill/uBlock/wiki/uBlock-Origin-works-b... ) and has written about features that are not able to be supported via Chrome's current V3 API ( https://github.com/uBlockOrigin/uBOL-home/wiki/Frequently-as... ). Of particular note are filtering for large media elements (I use this a lot on mobile Firefox, it's great for reducing page size), and top-level filtering of domains/fonts.
I think this perspective comes from a lack of historical experience and hands-on experience overall.
Nvidia more broadly has very impressive support for their GPUs. If you look at the support lifecycles for their Jetson hardware over time it's significantly worse. I encourage you to look at what support lifecycles have looked like, with the most "egregious" example being dropping of support for the Jetson Nano in from what I recall was within a couple of years.
Another consideration - Jetson is optimized for power efficiency/form-factor and on a per $ basis CUDA performance is terrible. The power efficiency and form-factor come at significant cost. See this discussion from one of my projects[0]. I evaluated the use of WIS on an Orin that I have and from what I can recall it was significantly slower than a GTX 1070 which is... Unimpressive.
In the end what do I care what people use, I'm offering the perspective and experience of someone who has actually used the Jetson line for many years and frequently struggled with all of these issues and more.
[0] - https://github.com/toverainc/willow-inference-server/discuss...
Not so sure about that. Check out https://github.com/ray-project/llmperf-leaderboard
And try mixtral on chat.groq.com