I have been using the pretrained tokenizers available from the huggingface/transformers library, and they have been working well for my use case.
For papers, take a look at the references listed at https://github.com/google/sentencepiece
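For anyone who wants to try the pretrained tokenizers mentioned above, here is a minimal sketch of loading one through `AutoTokenizer` from the transformers library. The helper name `tokenize_with_pretrained` and the default checkpoint `bert-base-uncased` are my own assumptions for illustration, not something from the comment; swap in whichever checkpoint fits your use case. The first call downloads the tokenizer files, so it needs network access.

```python
def tokenize_with_pretrained(text, model_name="bert-base-uncased"):
    """Split `text` into subword tokens and ids with a pretrained tokenizer.

    `model_name` is an assumed example checkpoint, not a recommendation.
    """
    # Imported lazily so this file can be loaded without transformers installed.
    from transformers import AutoTokenizer

    # Fetches the tokenizer files on first use, then reads them from the local cache.
    tokenizer = AutoTokenizer.from_pretrained(model_name)

    tokens = tokenizer.tokenize(text)              # subword strings
    ids = tokenizer.convert_tokens_to_ids(tokens)  # matching vocabulary ids
    return tokens, ids


if __name__ == "__main__":
    tokens, ids = tokenize_with_pretrained("Tokenizers split text into subword units.")
    print(tokens)
    print(ids)
```

The exact tokens and ids depend on the checkpoint's vocabulary, so it is worth printing them for a few sentences from your own data before relying on a particular tokenizer.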