gpt-tokenizer

JavaScript BPE Tokenizer Encoder Decoder for OpenAI's GPT-2 / GPT-3 / GPT-4. Port of OpenAI's tiktoken with additional features. (by niieani)

Gpt-tokenizer Alternatives

Similar projects and alternatives to gpt-tokenizer based on common topics and language

NOTE: The mention count for each project combines mentions in common posts with user-suggested alternatives. Hence, a higher number indicates a better gpt-tokenizer alternative or a closer match.

gpt-tokenizer reviews and mentions

Posts with mentions or reviews of gpt-tokenizer. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-06-13.
  • I wrote a tokenizer for LLaMA that runs inside the browser
    2 projects | /r/LocalLLaMA | 13 Jun 2023
    The GPT2 tokenizer and the LLaMA tokenizer differ in more than just the vocab and merge data. It would take me some time to implement a GPT2 tokenizer, and there are already good alternatives for that, so it wouldn't make sense to put time into making another one. For example, this library contains a GPT2 tokenizer: https://github.com/niieani/gpt-tokenizer

Stats

Basic gpt-tokenizer repo stats
Mentions: 1
Stars: 379
Activity: 3.3
Last commit: 14 days ago

niieani/gpt-tokenizer is an open source project licensed under the MIT License, which is an OSI-approved license.

The primary programming language of gpt-tokenizer is TypeScript.

