Top 23 Python Japanese Projects
Open source Pan-CJK pixel font / 开源的泛中日韩像素字体
Optical character recognition for Japanese text, with the main focus being Japanese manga
Project mention: Any way to extract characters from images, or are there any apps/ tools that allow you to handwrite the characters? | /r/LearnJapanese | 2023-05-19
I use manga-ocr on pc
Implementation of riichi mahjong related stuff (hand cost, shanten, agari end, etc.)
Project mention: I made a site to help you practice scoring in Riichi Mahjong :) | /r/Mahjong | 2022-10-01
Part credit should go to Nihisil on GitHub as I'm just using a function he created to spit out the details of the fu scoring.
🌿 An easy-to-use Japanese Text Processing tool, which makes it possible to switch tokenizers with small changes of code.
Japanese - English dictionary for Kindle based on the JMdict / EDICT database
Project mention: Onyx Boox e-reader | /r/LearnJapanese | 2023-02-17
Mostly cause I got a good workflow going and I buy most of my ebooks from amazon. I use stuff like this anki add-on to mine sentences (although I have kinda stopped doing this lately) and got this jmdict dictionary for kindle which is pretty nice. I could probably set up something with the Onyx Boox reader and yomichan like others have mentioned but at this point I'm too lazy. Also the kindle is smaller and more portable, the Boox I have is slightly larger and better for manga though.
Python scraper for Language Pods such as Japanesepod101.com. Compatible with Japanese, Chinese, French, German, Italian, Korean, Portuguese, Russian, Spanish and many more!
A comparison tool of Japanese tokenizers
A JSON kanji dataset with updated JLPT levels and WaniKani information
jiten - japanese android/cli/web dictionary based on jmdict/kanjidic — 日本語 辞典 和英辞典 漢英字典 和独辞典 和蘭辞典
Word2vec (word to vectors) approach for Japanese language using Gensim and Mecab.
Project mention: Abstract-Concreteness Value Lexical Data for Japanese | /r/linguistics | 2022-11-19
I'm looking for data on how concrete or abstract different lexical items are in Japanese, similar to this data for English. I'm not very well versed in computational linguistics, so although I've found this word-to-vector model that can create vectors for Japanese words, I'm not sure how to extrapolate abstractness values from the resulting vectors, or if that's even possible without using a predefined abstract-concrete vector like the one shown here.
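One common way to approach the question in the comment above is to build an "abstract-concrete" axis from a few hand-picked seed words and project every other word onto it. The following is only a minimal sketch of that idea, not a validated method and not part of the listed project: the vectors are made-up 3-dimensional toy values, and the seed lists and function names are hypothetical. With a real Gensim model, the vectors would instead be read from the trained model (for example via `model.wv[word]`).

```python
import math

# Toy 3-dimensional vectors standing in for real word2vec embeddings.
# All numbers are invented for illustration only.
VECTORS = {
    "石": [0.90, 0.10, 0.00],    # stone
    "机": [0.80, 0.20, 0.10],    # desk
    "自由": [0.10, 0.90, 0.00],  # freedom
    "愛": [0.20, 0.80, 0.10],    # love
    "水": [0.85, 0.15, 0.05],    # water
    "希望": [0.15, 0.85, 0.05],  # hope
}

# Hypothetical seed words chosen by hand for each end of the axis.
CONCRETE_SEEDS = ["石", "机"]
ABSTRACT_SEEDS = ["自由", "愛"]


def mean_vector(words):
    """Average the vectors of the given words, dimension by dimension."""
    dims = len(VECTORS[words[0]])
    return [sum(VECTORS[w][i] for w in words) / len(words) for i in range(dims)]


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def concreteness_axis():
    """Direction pointing from the abstract seeds toward the concrete seeds."""
    concrete = mean_vector(CONCRETE_SEEDS)
    abstract = mean_vector(ABSTRACT_SEEDS)
    return [c - a for c, a in zip(concrete, abstract)]


def concreteness_score(word):
    """Positive values lean concrete, negative values lean abstract."""
    return cosine(VECTORS[word], concreteness_axis())


if __name__ == "__main__":
    for word in ("水", "希望"):
        print(word, round(concreteness_score(word), 3))
```

The scores are only as meaningful as the seed words, so in practice they would need to be checked against human ratings before being used as lexical data.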
Export UNIHAN's database to csv, json or yaml
Project mention: is doing 50 new words/day pushing it too far with Anki? | /r/LearnJapanese | 2023-03-24
Early immersion is going to be painful, especially since you just started this month. If you do the pre-built deck for kuma bear 1 sorted by local frequency, to clear out a few hundred of the most frequent words in kuma bear 1, it might be doable. But the number of words is not the only factor: without at least N5-N4 grammar it would be really difficult to jump into reading native material. I would recommend watching the ToKini Andy Genki grammar videos and installing grammar dictionaries on your yomichan to look up grammar as you read.
Anki add-on for generating furigana and pitch accent coloring & graphs, including optional flexible card styling
Project mention: Completed 6000 Japanese Words by Frequency Today! (+My Card Look) | /r/Anki | 2023-03-11
Full Kanji Koohii to Anki Migration
Project mention: Here is a list of reasons I stayed away from Kanji. Its from misunderstanding Kanji learning. I hope someone who isn't still learning Kanji or new to Kanji may read this and learn from my mistakes. | /r/LearnJapanese | 2022-12-28
I've just recently finished it and there is a whole suite of scripts to export it to Anki when you are done. Here
Japanese pop-up dictionary for qutebrowser
Create Anki cards from Kindle's Vocab-Builder and Yomichan dictionaries (by Kartoffel0)
Project mention: What Japanese learning tools do you use on a regular basis? | /r/LearnJapanese | 2023-02-10
Simple script to generate flashcards for studying kanji
Interactive japanese kanji writing drill practice for anki with stroke order
Project mention: KanjiVG – SVGs of Kanji character strokes including order, shape and direction | news.ycombinator.com | 2023-02-21
Chinese character dictionary for learning Sino-xenic languages
Project mention: Office of the President of Mongolia (top to bottom text on the web) | news.ycombinator.com | 2023-04-24
I loved learning to read Japanese through the second volume of Heisig's _Remembering the Kanji_. Volume 1, which teaches only meanings, is a slog, but volume 2, which teaches the Sino-Japanese readings, is a beautiful example of organizing material to minimize entropy and maximize benefit for memorization as soon as possible. Unfortunately he never put together a volume 2 for a Chinese language. I haven't worked on it in a while, but I have a project where I attempt to re-create the book for Japanese as well as Mandarin, Korean, and Vietnamese: https://nateglenn.com/uniunihan-db/ (repo: https://github.com/garfieldnate/uniunihan-db).
The "pure groups" are the ones where the presence of a specific radical guarantees you a specific pronunciation (within the list of character/pronunciation pairs you're trying to learn). Of the 4800 characters I used for the volume, only 290 are in the chapter on pure groups. The rest are either in semi-regular groups with varying numbers of exceptions, or in completely irregular groups with no discernible patterns.
The characters were designed continuously over a period of time starting thousands of years ago, and the phonetic parts were sometimes exact and sometimes just clues, similar sounds or rhymes to give the reader a hint. Ancient Chinese pronunciation has changed beyond recognition, so it makes perfect sense that the pronunciations wouldn't be regular anymore.
Mainland China uses a "simplified" character set, which did not affect literacy but in my opinion is a bit more difficult to read; they reduced the number of lines so that more characters look samey and they combined many (Mandarin) homonyms (https://en.wikipedia.org/wiki/Simplified_Chinese_characters#...), removing the meaning portion of characters that would have distinguished them. The simplification did not apply to all characters, so to achieve a high level of literacy you need to know traditional forms, anyway.
It would be interesting to see someone try to actually remodel hanzi from scratch for a specific dialect of Chinese, using 100% regular phonetic components and no variants; multiple pronunciations of a character in the current system would be required to be written differently. An interesting example of this would be certain Korean gukja, where they've combined a Chinese character with a phonetic hangeul (example: https://en.wiktionary.org/wiki/%E3%AB%87). This would be a truly simplified Chinese character set... but all of the culture's history that gets built into spelling over time would be completely lost, which is why I always prefer conservative spelling systems.
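The "pure group" idea described in the comment above can be sketched in a few lines: group characters by a shared component and check whether every character in the group has the same reading within the study list. This is only an illustration under stated assumptions, not the algorithm used by uniunihan-db. The `ENTRIES` data is a tiny hand-made toy list with one on'yomi per character, and `split_groups` is a hypothetical helper name.

```python
from collections import defaultdict

# Toy (character, shared component, reading) triples, one reading per
# character. A real study list would assign readings far more carefully.
ENTRIES = [
    ("清", "青", "セイ"),
    ("晴", "青", "セイ"),
    ("精", "青", "セイ"),
    ("時", "寺", "ジ"),
    ("持", "寺", "ジ"),
    ("詩", "寺", "シ"),
    ("待", "寺", "タイ"),
]


def split_groups(entries):
    """Split components into pure groups (a single reading across the list)
    and mixed groups (more than one reading across the list)."""
    chars = defaultdict(list)
    readings = defaultdict(set)
    for char, component, reading in entries:
        chars[component].append(char)
        readings[component].add(reading)
    pure = {c: chars[c] for c in chars if len(readings[c]) == 1}
    mixed = {c: chars[c] for c in chars if len(readings[c]) > 1}
    return pure, mixed


if __name__ == "__main__":
    pure, mixed = split_groups(ENTRIES)
    print("pure:", pure)
    print("mixed:", mixed)
```

Note that purity here is relative to the list being studied, as the comment says: adding one more character with a different reading can turn a pure group into a mixed one.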
(more than just) A Python wrapper for the Sakura Paris (Japanese) Dictionary API. All definitions are monolingual.
Translation project of "Phantom Brigade". Check "Releases".
An Anki add-on for adding the word frequency to the Japanese words in a specific deck.
Create worksheets for learning an Asian language (e.g. Chinese) and practicing reading and writing in grid format. A good tool for kids and beginners.
Project mention: I made this tool to help my kids learn Chinese | /r/Python | 2023-06-05
Python Japanese related posts
Any way to extract characters from images, or are there any apps/ tools that allow you to handwrite the characters?
1 project | /r/LearnJapanese | 19 May 2023
Do you guys know where I can read the translated version of Isekai Joshi Kangoku?
2 projects | /r/shoujoai | 16 May 2023
How do you read Japanese?
2 projects | /r/LearnJapanese | 10 May 2023
Looking for a program for quick word extraction WITHOUT leaving the screen?
1 project | /r/LearnJapanese | 2 May 2023
easy manga that writes left to right (horizontal) and uses kanji with furigana
1 project | /r/LearnJapanese | 28 Apr 2023
What is the most accurate Windows OCR
3 projects | /r/LearnJapanese | 18 Apr 2023
is doing 50 new words/day pushing it too far with Anki?
1 project | /r/LearnJapanese | 24 Mar 2023
What are some of the best open-source Japanese projects in Python? This list will help you: