janome vs asian-comprehension-worksheet-generator
Metric | janome | asian-comprehension-worksheet-generator
---|---|---
Mentions | 2 | 1
Stars | 828 | 0
Growth | - | -
Activity | 5.2 | 2.5
Last commit | 11 months ago | 12 months ago
Language | Python | Python
License | Apache License 2.0 | GNU General Public License v3.0 only
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
janome
- [discussion] Open AI api translations
- [Computer Stuff] What's the best way to split a Japanese sentence into "words"?
I have done a bit of programming like that for Korean and Japanese. In short, these tools and libraries are called tokenizers: search for "Japanese tokenizer" and you will find that MeCab is one of them. There is no good or easy way to split Japanese into words with a simple algorithm, so libraries like these, which rely on statistics or AI, are your only real choice. A good example sentence that shows how futile it would be without them: "すもももももももものうち".
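To make the ambiguity in that example sentence concrete, here is a minimal sketch that enumerates every way to split it using a tiny word list. The lexicon, the `segmentations` helper, and the `INTENDED` reading are assumptions made up for illustration; this is not how janome or MeCab work internally, since they rely on large dictionaries plus cost models to pick one reading.

```python
from functools import lru_cache

# Toy lexicon, assumed for illustration only; real tokenizers ship large dictionaries.
LEXICON = frozenset({"すもも", "もも", "も", "の", "うち"})


def segmentations(text, lexicon=LEXICON):
    """Return every way to split `text` into words that appear in `lexicon`."""
    max_len = max(map(len, lexicon))

    @lru_cache(maxsize=None)
    def split_from(start):
        # Reaching the end of the text yields exactly one (empty) completion.
        if start == len(text):
            return ((),)
        results = []
        # Try every dictionary word that begins at position `start`.
        for end in range(start + 1, min(len(text), start + max_len) + 1):
            word = text[start:end]
            if word in lexicon:
                results.extend((word,) + rest for rest in split_from(end))
        return tuple(results)

    return [list(s) for s in split_from(0)]


# The reading usually given for the sentence: plum / also / peach / also / peach / of / among.
INTENDED = ["すもも", "も", "もも", "も", "もも", "の", "うち"]
TEXT = "".join(INTENDED)

candidates = segmentations(TEXT)
print(len(candidates))         # number of dictionary-valid splits
print(INTENDED in candidates)  # the intended reading is only one of them
```

Even with five dictionary entries there are several valid splits, and nothing in the word list alone says which one is right. With janome installed, `Tokenizer().tokenize(text, wakati=True)` from `janome.tokenizer` is the call that returns the chosen split as plain strings.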
asian-comprehension-worksheet-generator
What are some alternatives?
kanji-data - A JSON kanji dataset with updated JLPT levels and WaniKani information
chinese-xinhua - Chinese Xinhua Dictionary database, including xiehouyu (two-part allegorical sayings), idioms, words, and characters.
tika-python - Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.
wakaranai - An educational tool for learning hiragana and katakana
chinese-shadowing - Application for shadowing Chinese.
skweak - skweak: A software toolkit for weak supervision applied to NLP tasks
miteiru - Miteiru is an open-source Electron video player for learning Chinese, Cantonese, and Japanese. It plays YouTube videos and videos in any HTML5-supported format (.mkv, .mp4, .mov, and many more), and it supports many subtitle formats (.srt, .ass, .vtt, and many more).
transformers - 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
languagepod101-scraper - Python scraper for Language Pods such as Japanesepod101.com. Compatible with Japanese, Chinese, French, German, Italian, Korean, Portuguese, Russian, Spanish and many more! ✨
spaCy - 💫 Industrial-strength Natural Language Processing (NLP) in Python