whatlanggo vs gocld3
 | whatlanggo | gocld3
---|---|---
Mentions | 1 | 1
Stars | 624 | 18
Growth | - | -
Activity | 0.0 | 0.0
Last commit | about 1 year ago | 11 months ago
Language | Go | C++
License | MIT License | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
whatlanggo
Announcing Lingua 1.0.0: The most accurate natural language detection library in the Go ecosystem, suitable for long and short text alike
So far, the only other comprehensive open source library in the Go ecosystem for this task is Whatlanggo. Unfortunately, it has two major drawbacks:
gocld3
Announcing Lingua 1.0.0: The most accurate natural language detection library in the Go ecosystem, suitable for long and short text alike
My solution is based on https://github.com/jmhodges/gocld3. I actually precompiled the C/C++ part (which includes CLD3 and protobuf) into a library to speed up CI builds, but for a simple benchmark you should be able to compile the projects as is.
What are some alternatives?
sentences - A multilingual command line sentence tokenizer in Golang
lingua-go - The most accurate natural language detection library for Go, suitable for short text and mixed-language text
go-stem - Word Stemming in Go
go-mystem - CGo bindings to Yandex.Mystem
universal-translator - i18n Translator for Go/Golang using CLDR data + pluralization rules
petrovich - Golang port of Petrovich - an inflector for Russian anthroponyms.
gse - Go efficient multilingual NLP and text segmentation; support English, Chinese, Japanese and others.
go-unidecode - ASCII transliterations of Unicode text.
gounidecode - Unicode transliterator for #golang
gojieba - Golang version of the "Jieba" (结巴) Chinese word segmentation library
nlp - [UNMAINTAINED] Extract values from strings and fill your structs with nlp.
gotokenizer - A tokenizer based on the dictionary and Bigram language models for Go. (Currently supports only Chinese segmentation)