prose
textcat
Stars - the number of stars that a project has on GitHub.
Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
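The relationship between a raw activity value and the relative score can be illustrated with a minimal sketch. It is not the site's actual formula, which is not given here. It assumes the score is ten times the fraction of tracked projects with a strictly lower raw value, and that recency weighting is an exponential decay with a 30-day half-life. The names `activity_score`, `weighted_commits`, and `half_life_days` are hypothetical.

```python
from bisect import bisect_left


def weighted_commits(commit_ages_days, half_life_days=30.0):
    """Sum commits so that recent ones count more than older ones.

    Assumption: each commit contributes 0.5 ** (age / half_life_days).
    The half-life value is illustrative only.
    """
    return sum(0.5 ** (age / half_life_days) for age in commit_ages_days)


def activity_score(value, all_values):
    """Map a raw activity value to a relative score between 0 and 10.

    Assumption: the score is 10 times the fraction of tracked projects
    whose raw value is strictly lower than this one.
    """
    ranked = sorted(all_values)
    below = bisect_left(ranked, value)  # number of projects strictly below
    return round(10 * below / len(ranked), 1)
```

Under these assumptions, a project whose raw value exceeds that of 90 out of 100 tracked projects gets a score of 9.0, which matches the "top 10%" reading above.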
prose
Go+: Go designed for data science
Apart from the Gonum[1] numerical libraries, I haven't found many Go libraries specific to data science, compared to the Python ecosystem, when searching for some for hobby projects.
Interestingly, Prose[2], a Go library for text processing, yielded better results for named-entity extraction than NLTK in my tests, both in accuracy and, obviously, in performance.
Perhaps Go is not being applied enough in data science/ML, and for the fields where it is applied (networking), the math in the standard library seems to be sufficient.
[1] https://github.com/gonum/gonum
[2] https://github.com/jdkato/prose
textcat
We haven't tracked posts mentioning textcat yet.
Tracking mentions began in Dec 2020.
What are some alternatives?
gse - Go efficient multilingual NLP and text segmentation; support English, Chinese, Japanese and others.
nlp - [UNMAINTAINED] Extract values from strings and fill your structs with nlp.
go-i18n - Translate your Go program into multiple languages.
shamoji - The shamoji (杓文字) is a word filtering package
porter2 - High Performance Porter2 Stemmer
go-nlp
gojieba - The Golang version of the "结巴" (Jieba) Chinese word segmentation library
golibstemmer - Go bindings for the snowball libstemmer library including porter 2
go-mystem - CGo bindings to Yandex.Mystem
sentences - A multilingual command line sentence tokenizer in Golang
MMSEGO - Chinese word splitting algorithm MMSEG in GO