llama-tokenizer-js VS gpt-tokenizer

Compare llama-tokenizer-js and gpt-tokenizer to see how they differ.

gpt-tokenizer

JavaScript BPE Tokenizer Encoder Decoder for OpenAI's GPT-2 / GPT-3 / GPT-4. Port of OpenAI's tiktoken with additional features. (by niieani)
              llama-tokenizer-js   gpt-tokenizer
Mentions      5                    1
Stars         309                  388
Growth        -                    -
Activity      7.1                  3.3
Last commit   26 days ago          5 days ago
Language      JavaScript           TypeScript
License       MIT License          MIT License
Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

llama-tokenizer-js

Posts with mentions or reviews of llama-tokenizer-js. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-06-13.

gpt-tokenizer

Posts with mentions or reviews of gpt-tokenizer. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-06-13.
  • I wrote a tokenizer for LLaMA that runs inside the browser
    2 projects | /r/LocalLLaMA | 13 Jun 2023
    There are more differences between the GPT2 tokenizer and the LLaMA tokenizer than just the vocab and merge data. It would take me some time to implement a GPT2 tokenizer, and there are already good alternatives for those, so it wouldn't make sense to put time into making another one. For example, this library contains a GPT2 tokenizer: https://github.com/niieani/gpt-tokenizer

What are some alternatives?

When comparing llama-tokenizer-js and gpt-tokenizer you can also consider the following projects:

Constrained-Text-Generation-Studio - Code repo for "Most Language Models can be Poets too: An AI Writing Assistant and Constrained Text Generation Studio" at the (CAI2) workshop, jointly held at (COLING 2022)

Lobe Chat - LobeChat is an open-source, extensible (Function Calling), high-performance chatbot framework. It supports one-click free deployment of your private ChatGPT/LLM web application.

fastbpe - Java library implementing Byte-Pair Encoding Tokenization

openai-caching-proxy-worker - caching proxy for OpenAI API, deployable as a Cloudflare Worker

bpe-encoder-php - BPE (Byte-Pair Encoding) Encoder Decoder for OpenAI's GPT-2 / GPT-3 Implemented In Pure PHP, Zero Dependency, Multi Byte Supported.

agency - Agency: Robust LLM Agent Management with Go

auto-gpt-web - Set Your Goals, AI Achieves Them.

tokenizer - Pure Go implementation of OpenAI's tiktoken tokenizer

tiktoken - JS port and JS/WASM bindings for openai/tiktoken

gpt4-tokenizer-visualizer - GPT4 Tokenizer Visualizer