konoha
🌿 An easy-to-use Japanese Text Processing tool, which makes it possible to switch tokenizers with small changes of code. (by himkt)
toiro
A comparison tool of Japanese tokenizers (by taishi-i)
Metric | konoha | toiro
---|---|---
Mentions | 1 | 1
Stars | 214 | 112
Growth | - | -
Activity | 7.8 | 5.2
Last commit | 9 days ago | 10 months ago
Language | Python | Python
License | MIT License | Apache License 2.0
Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
konoha
Posts with mentions or reviews of konoha. We have used some of these posts to build our list of alternatives and similar projects.
toiro
Posts with mentions or reviews of toiro. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2021-08-04.
- Any recommendations for a good Japanese NLP engine?
Thank you! I have also been looking at Toiro, which is not an NLP engine but a comparison tool, and it includes MeCab. You can use it to install all the Japanese language parsers it knows about and then run tests on your data set. Right now I'm running each one on the game script I have to see which one is best.
What are some alternatives?
When comparing konoha and toiro you can also consider the following projects:
fugashi - A Cython MeCab wrapper for fast, pythonic Japanese tokenization and morphological analysis.
pykakasi - Lightweight converter from Japanese Kana-kanji sentences into Kana-Roman.
spaCy - 💫 Industrial-strength Natural Language Processing (NLP) in Python
pythainlp - Thai Natural Language Processing in Python.
jProcessing - Japanese Natural Language Processing Libraries