Metric | kagome | Sudachi
---|---|---
Mentions | 1 | 2
Stars | 789 | 741
Growth | - | 0.7%
Activity | 6.4 | 5.2
Last commit | 12 days ago | 21 days ago
Language | Go | Java
License | MIT License | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
kagome
- How do MeCab, Kuromoji and Kagome (Japanese Text Analyzer) compare; and which dictionary to choose?
Kagome is a more recently updated library implemented in Golang.
Sudachi
- Python Text Parsing Project: Furigana Inserter for Anki
Instead of the common segmentation tool Mecab, this project will use Sudachi, which features multiple text segmentation modes as well as Furigana retrieval.
- Gauging interest and plausibility of an overhaul of Anki's Morphman
I don't think there's anything special about Ichiran here*, rather, as you observe, MeCab isn't quite the right tool for the job. A quick google suggests that people sometimes follow MeCab with J.DepP to end up with bunsetsu, which is (I think) what you'd want for Morphman. Sudachi has python bindings and offers a couple of different levels of aggression/granularity.
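To make the "multiple segmentation modes" and "levels of granularity" mentioned in the posts above concrete, here is a toy sketch in Python. It does not call Sudachi or SudachiPy; the `SPLITS` table and the `segment` function are hypothetical names invented for illustration, and the table is hand-written rather than taken from a real dictionary. The mode names A (shortest units), B (middle), and C (longest) follow Sudachi's split-mode naming.

```python
# Toy illustration of multi-granularity segmentation.
# SPLITS is a hand-written, hypothetical table. It is NOT Sudachi's
# dictionary format, and segment() is NOT part of any Sudachi API.
SPLITS = {
    "国家公務員": {
        "A": ["国家", "公務", "員"],  # shortest units
        "B": ["国家", "公務員"],      # middle-length units
        "C": ["国家公務員"],          # longest unit, kept as one word
    },
}


def segment(word: str, mode: str = "C") -> list[str]:
    """Return the units of `word` at granularity `mode` ("A", "B", or "C").

    Words that are missing from the toy table are returned unsplit.
    """
    if mode not in ("A", "B", "C"):
        raise ValueError(f"unknown mode: {mode!r}")
    entry = SPLITS.get(word)
    if entry is None:
        return [word]
    # Return a copy so callers cannot mutate the table.
    return list(entry[mode])


if __name__ == "__main__":
    for mode in ("A", "B", "C"):
        print(mode, segment("国家公務員", mode))
```

For a tool like MorphMan, the choice of mode decides what counts as one "word": the finer mode treats the compound as three known or unknown items, while the coarsest mode treats it as a single item.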
What are some alternatives?
gse - Go efficient multilingual NLP and text segmentation; support English, Chinese, Japanese and others.
SudachiPy - Python version of Sudachi, a Japanese tokenizer.
gojieba - Go version of the "Jieba" Chinese word segmentation library
MorphMan - Anki plugin that reorders language cards based on the words you know
prose - A Golang library for text processing, including tokenization, part-of-speech tagging, and named-entity extraction.
wanakana-py - Port of wanakana by the WaniKani team
sentences - A multilingual command line sentence tokenizer in Golang
go-i18n - Translate your Go program into multiple languages.
getlang - Natural language detection package in pure Go
whatlanggo - Natural language detection library for Go
gounidecode - Unicode transliterator for #golang
iuliia-go - Transliterate Cyrillic → Latin in every possible way