A tokenizer for Go based on a dictionary and bigram language models. (Currently supports only Chinese segmentation.) (by xujiajun)


Basic gotokenizer repo stats
about 2 years ago

xujiajun/gotokenizer is an open source project licensed under the Apache License 2.0, which is an OSI-approved license.

Gotokenizer Alternatives

Similar projects and alternatives to gotokenizer based on common topics and language

  • GitHub repo olivia

    💁‍♀️Your new best friend powered by an artificial neural network

  • GitHub repo prose

    A Golang library for text processing, including tokenization, part-of-speech tagging, and named-entity extraction.

  • GitHub repo go-i18n

    Translate your Go program into multiple languages.

  • GitHub repo gse

    Efficient text segmentation and NLP in Go; supports English, Chinese, Japanese, and other languages. (High-performance word segmentation for Go.)

  • GitHub repo gojieba


  • GitHub repo when

    A natural language date/time parser with pluggable rules (by olebedev)

  • GitHub repo go-pinyin


NOTE: The number of mentions on this list counts mentions in common posts, so a higher number indicates a better gotokenizer alternative or a more similar project.


Posts where gotokenizer has been mentioned. We have used some of these posts to build our list of alternatives and similar projects.

We don't know of any posts mentioning gotokenizer yet. We started tracking mentions in Dec 2020.