kagome
prose
Stars - the number of stars that a project has on GitHub.
Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
kagome
How do MeCab, Kuromoji and Kagome (Japanese Text Analyzer) compare; and which dictionary to choose?
Kagome is a more recently updated library implemented in Golang.
prose
Go+: Go designed for data science
Apart from the Gonum[1] numerical libraries, I haven't found many Go libraries aimed specifically at data science while searching for some hobby projects, especially compared to the Python ecosystem.
Interestingly, Prose[2], a Go library for text processing, yielded better results for named-entity extraction than NLTK in my tests, in terms of accuracy and, obviously, performance.
Perhaps Go is not being applied enough in data science/ML, and for the fields where it is applied (networking), the math in the standard library seems to be sufficient.
[1] https://github.com/gonum/gonum
[2] https://github.com/jdkato/prose
What are some alternatives?
Sudachi - A Japanese Tokenizer for Business
gse - Go efficient multilingual NLP and text segmentation; supports English, Chinese, Japanese and others.
go-i18n - Translate your Go program into multiple languages.
gojieba - Go version of the "Jieba" (结巴) Chinese word segmentation library
textcat - A Go package for n-gram based text categorization, with support for utf-8 and raw text
sentences - A multilingual command line sentence tokenizer in Golang
porter2 - High Performance Porter2 Stemmer
getlang - Natural language detection package in pure Go
go-mystem - CGo bindings to Yandex.Mystem