Tokenization

Top 20 Tokenization Open-Source Projects

  • spaCy

    💫 Industrial-strength Natural Language Processing (NLP) in Python

  • Project mention: Step by step guide to create customized chatbot by using spaCy (Python NLP library) | dev.to | 2024-03-23

    Hi Community, in this article I will demonstrate the steps below to create your own chatbot using spaCy (an open-source software library for advanced natural language processing, written in Python and Cython):

  • lunasec

    LunaSec - Dependency Security Scanner that automatically notifies you about vulnerabilities like Log4Shell or node-ipc in your Pull Requests and Builds. Protect yourself in 30 seconds with the LunaTrace GitHub App: https://github.com/marketplace/lunatrace-by-lunasec/

  • Databunker

    Secure SDK/vault for personal records/PII built to comply with GDPR

  • Ravencoin

    Ravencoin Core integration/staging tree

  • trankit

    Trankit is a Light-Weight Transformer-based Python Toolkit for Multilingual Natural Language Processing

  • razdel

    Rule-based token and sentence segmentation for the Russian language

  • TokenScript

    TokenScript schema, specs and paper

  • simplemma

    Simple multilingual lemmatizer for Python, especially useful for speed and efficiency

  • l8w8jwt

    Minimal, OpenSSL-less and super lightweight JWT library written in C.

  • Project mention: L8w8jwt – a minimal, OpenSSL-less and lightweight JWT library written in C | news.ycombinator.com | 2023-07-23
  • python-fpe

    FPE - Format Preserving Encryption with FF3 in Python

  • cashtokens

    A proposal to enable two new primitives on Bitcoin Cash: fungible tokens and non-fungible tokens.

  • Project mention: Cashtokens? | /r/btc | 2023-07-02

    Useful References: https://cashtokens.org/ https://helpme.cash/CashTokens/

  • bioseq

    Tokenizers and Machine Learning Models for biological sequence data

  • Project mention: ResearchAgent: Iterative Research Idea Generation Using LLMs | news.ycombinator.com | 2024-04-20
  • wink-eng-lite-model

    English lite language model for wink-nlp.

  • boolean-expression-parser

    Boolean expression parser

  • cashtokens.org

    A community-maintained website about the CashTokens technology, including technical specifications, documentation, guides, and other resources.

  • maleeni

    A lexer generator for golang

  • pass3d

    3D object recognition CLI tool for Linux

  • dave.liquid.wine

    Blockstream AMP tokenization example

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source Tokenization projects? This list will help you:

Rank  Project                    Stars
1     spaCy                      28,704
2     lunasec                    1,406
3     Databunker                 1,208
4     Ravencoin                  1,070
5     trankit                    705
6     razdel                     243
7     TokenScript                238
8     simplemma                  125
9     l8w8jwt                    123
10    python-fpe                 79
11    cashtokens                 44
12    xontrib-output-search      37
13    bioseq                     19
14    mongo-search               14
15    wink-eng-lite-model        10
16    boolean-expression-parser  10
17    cashtokens.org             9
18    maleeni                    5
19    pass3d                     4
20    dave.liquid.wine           0
