triton
web-llm
| | triton | web-llm |
|---|---|---|
| Mentions | 30 | 42 |
| Stars | 10,981 | 9,102 |
| Growth | 7.9% | 5.1% |
| Activity | 9.9 | 9.1 |
| Last commit | 3 days ago | 4 days ago |
| Language | C++ | TypeScript |
| License | MIT License | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
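The site does not document its exact activity formula, but the description above (recent commits weighted more heavily than older ones) suggests a recency-weighted sum. A minimal sketch, assuming a hypothetical exponential half-life weighting:

```python
from datetime import datetime, timedelta, timezone

def activity_score(commit_dates, half_life_days=30.0, now=None):
    """Toy recency-weighted activity score: each commit contributes a
    weight that halves every `half_life_days` days, so recent commits
    dominate the total while old ones decay toward zero."""
    now = now or datetime.now(timezone.utc)
    score = 0.0
    for d in commit_dates:
        age_days = (now - d).total_seconds() / 86400.0
        score += 0.5 ** (age_days / half_life_days)
    return score

now = datetime(2023, 10, 1, tzinfo=timezone.utc)
recent = [now - timedelta(days=i) for i in range(10)]       # 10 commits this week-and-a-half
stale = [now - timedelta(days=300 + i) for i in range(10)]  # 10 commits from ~10 months ago
print(activity_score(recent, now=now) > activity_score(stale, now=now))  # True
```

The half-life constant and the formula itself are assumptions for illustration, not the site's actual metric.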
triton
- OpenAI Triton: a language and compiler for writing highly efficient custom Deep-Learning primitives
Show HN: Ollama for Linux – Run LLMs on Linux with GPU Acceleration
There's a ton of cool opportunity in the runtime layer. I've been keeping my eye on the compiler-based approaches. From what I've gathered, many of the larger "production" inference tools use compilers:
- https://github.com/openai/triton
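Triton kernels are written in Python: each "program" instance handles one fixed-size block of data, with masked loads and stores covering the ragged tail. A pure-Python sketch of that blocked model (no GPU or `triton` install needed; the real API uses `@triton.jit` and `triton.language` primitives like `tl.program_id`, `tl.load`, and `tl.store`):

```python
# Pure-Python emulation of Triton's blocked programming model:
# each "program" (identified by pid) processes one BLOCK_SIZE tile,
# masking off out-of-range elements -- the same pattern a real
# Triton vector-add kernel uses with masked tl.load/tl.store.
def add_kernel(x, y, out, pid, BLOCK_SIZE):
    start = pid * BLOCK_SIZE
    for i in range(start, start + BLOCK_SIZE):
        if i < len(x):  # mask, like tl.load(..., mask=offsets < n)
            out[i] = x[i] + y[i]

def vector_add(x, y, BLOCK_SIZE=4):
    out = [0.0] * len(x)
    num_programs = -(-len(x) // BLOCK_SIZE)  # ceil div, like triton.cdiv
    for pid in range(num_programs):          # on a GPU these run in parallel
        add_kernel(x, y, out, pid, BLOCK_SIZE)
    return out

print(vector_add([1, 2, 3, 4, 5], [10, 20, 30, 40, 50]))  # [11, 22, 33, 44, 55]
```

The sequential loop over `pid` stands in for the parallel grid launch; everything else maps one-to-one onto the structure of a real Triton kernel.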
- Core Functionality for AMD #1983
- Project name easily confused with Nvidia Triton
Nvidia's CUDA Monopoly
Does anyone have more inside knowledge from OpenAI or AMD on AMDGPU support for Triton?
I see this:
https://github.com/openai/triton/issues/1073
But it's not clear to me whether we will see AMD GPUs as first-class citizens for PyTorch in the future.
- @soumithchintala (co-founded and leads @PyTorch at Meta) on Twitter: I'm fairly puzzled by $NVDA skyrocketing... (cont.)
The tiny corp raised $5.1M
I thought this was a good overview of the idea that Triton could circumvent the CUDA moat: https://www.semianalysis.com/p/nvidiaopenaitritonpytorch
It also looks like they added an MLIR backend to Triton, though I wonder whether Mojo has an advantage since it was built on MLIR from the start: https://github.com/openai/triton/pull/1004
Anyone hosting a local LLM server
I'm pretty happy with the setup, because it allows me to keep all the AI stuff, with its dozens of conda envs and repos, separate from my normal setup and "portable". It may have some performance impact (although I don't personally notice any significant difference to running it "natively" on Windows), and it may enable some extra functionality, such as access to OpenAI's Triton etc., but that's currently neither here nor there.
- Triton: Runtime for highly efficient custom Deep-Learning primitives
Mojo – a new programming language for all AI developers
Very cool development. There is too much busy work going from development to test to production. This will help to unify everything. OpenAI Triton https://github.com/openai/triton/ is going for a similar goal. But this is a more fundamental approach.
web-llm
- What stack would you recommend to build a LLM app in React without a backend?
When LLM doesn’t fit into memory, how to make it work?
So I was playing with MLC web-llm locally. I got my Mistral 7B model installed and quantized, then converted it with the MLC library to a Metal package for Apple chips. Now it takes only 3.5GB of memory.
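The ~3.5GB figure is consistent with 4-bit weight quantization for a ~7B-parameter model. A rough back-of-the-envelope check (weights only, ignoring quantization-scale overhead, activations, and the KV cache):

```python
# Mistral 7B has roughly 7.24 billion parameters.
params = 7.24e9

bytes_fp16 = params * 2    # 16-bit weights: 2 bytes per parameter
bytes_q4 = params * 0.5    # 4-bit weights: half a byte per parameter

print(f"fp16: {bytes_fp16 / 2**30:.1f} GiB")  # ~13.5 GiB
print(f"q4:   {bytes_q4 / 2**30:.1f} GiB")    # ~3.4 GiB
```

Real 4-bit formats store per-group scales on top of this, which is why measured memory use lands slightly above the raw estimate.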
Show HN: Ollama for Linux – Run LLMs on Linux with GPU Acceleration
Maybe they're talking about https://github.com/mlc-ai/mlc-llm which is used for web-llm (https://github.com/mlc-ai/web-llm)? Seems to be using TVM.
- Local embeddings model for javascript
this makes deploying AI language models so much easier
Link to GitHub for those who want to learn about MLC straight from them. The web demo is cool, but takes a long time to load the first time. https://github.com/mlc-ai/web-llm
April 2023
web-llm: Bringing large-language models and chat to web browsers. (https://github.com/mlc-ai/web-llm)
- Running a small model on a phone?
Weekly Megathread - 14 May 2023
WebLLM - https://mlc.ai/web-llm/
- WebLLM - Bringing LLM-based chatbots to your web browser
Google is bringing AI to the browser with WebGPU in Chrome
which makes this work in the browser:
https://mlc.ai/web-llm/#chat-demo
What are some alternatives?
cuda-python - CUDA Python Low-level Bindings
chainlit - Build Conversational AI in minutes ⚡️
Halide - a language for fast, portable data-parallel computation
mlc-llm - Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
GPU-Puzzles - Solve puzzles. Learn CUDA.
gpt4all - gpt4all: run open-source LLMs anywhere
dfdx - Deep learning in Rust, with shape checked tensors and neural networks
StableLM - StableLM: Stability AI Language Models
cutlass - CUDA Templates for Linear Algebra Subroutines
FastChat - An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
maxas - Assembler for NVIDIA Maxwell architecture
duckdb-wasm - WebAssembly version of DuckDB