ichiran
rakutenma-python
| | ichiran | rakutenma-python |
|---|---|---|
| Mentions | 3 | 1 |
| Stars | 278 | 21 |
| Growth | - | - |
| Activity | 0.0 | 10.0 |
| Last commit | 3 months ago | almost 7 years ago |
| Language | Common Lisp | Python |
| License | MIT License | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
ichiran
- I'm looking for a reliable Japanese word segmentation algorithm
  Check out ichi.moe. The word detection and splitting is quite good, and the backend is available on GitHub as ichiran. Unfortunately for most sane developers, the backend is written in Lisp.
- Function & Variable Naming Conventions?
  Here's an example from my codebase which uses a lot of creative naming. There are "suffixes" (which I guess is a grammar term) but also "patches", "penalties", "synergies", "segfilters" and so on which are the terms I made up solely for this code.
- Starting a batteries-included extended standard library project. Request for comments.
  You say batteries-included, I see a kitchen sink. Not everything needs huge libraries like ironclad loaded in, and every new dependency is a potential breakage in the future. I like to occasionally look at the dependencies list in my projects' .asd and see if I can get rid of something. For example, I used cl-str only for its join function... And then I saw how it's implemented. I mean, really??? I rolled my own join instead. But if everyone starts using these batteries-included kitchen sinks, I would still be loading a bunch of libraries I don't ever intend to use. I hear it's a big problem in the Node.js community.
rakutenma-python
- I'm looking for a reliable Japanese word segmentation algorithm
  There are some projects on GitHub that seem promising (e.g. https://github.com/ikegami-yukino/rakutenma-python) but I just have to re-emphasize that even if the computer is getting 95%+ of the sentences right, a learner is going to be looking for help with that remaining 5% and the computer will never have it.
What are some alternatives?
languagepod101-scraper - Python scraper for Language Pods such as Japanesepod101.com 👹 🗾 🍣 Compatible with Japanese, Chinese, French, German, Italian, Korean, Portuguese, Russian, Spanish and many more! ✨
ginza - A Japanese NLP Library using spaCy as framework based on Universal Dependencies
yomichan - Japanese pop-up dictionary extension for Chrome and Firefox.
JL - JL is a program for looking up Japanese words and expressions.
CIEL - CIEL Is an Extended Lisp. Scripting with batteries included.
common-lisp-standard-library
3bmd - markdown processor in CL using esrap parser
cl-utils - GrammaTech Common Lisp Utilities
jp-verb-deconjugator - Unconjugate conjugated Japanese verbs.
tune - An Intermediate Constructed Language
nyxt - Nyxt - the hacker's browser.