hfst
rx
Metric | hfst | rx |
---|---|---|
Mentions | 3 | 1 |
Stars | 116 | 3 |
Growth | 0.9% | - |
Activity | 4.0 | 3.1 |
Last commit | about 1 month ago | 6 months ago |
Language | C++ | Rust |
License | GNU General Public License v3.0 only | - |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
hfst
- A portable, modern regular expression language
- Search-and-replace with correct grammatical case - does it exist?
So you want to go from Spende to Spende+genitive to Beitrag+genitive to Beitrags. In addition to Spacy etc. you might look at Finite State Transducers, which I believe are bidirectional, i.e. for both analysis and generation. XFST and SFST and OpenFST are a few of the FST toolkits. See https://github.com/hfst/hfst for the Helsinki FST; there's a German transducer for it at https://sourceforge.net/projects/hfst/files/resources/morphological-transducers/hfst-german-installable.tar.gz/download. I don't think there is much of a learning curve, and there should be plenty of documentation.
- Foldable Words
The regex syntax is a bit quirky due to backwards compatibility with lexicons written in XFST, see https://github.com/hfst/hfst/wiki/Regular-Expression-Operato...
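The analysis-then-generation round trip described in the search-and-replace post above (Spende to Spende+genitive to Beitrag+genitive to Beitrags) can be sketched with a toy lookup table that is read in both directions. This is only an illustration of the idea, not HFST's API or output: the word pairs, the `+Gen`/`+Nom` tag names, and all function names below are assumptions made for the example.

```python
# Toy stand-in for a bidirectional morphological transducer: a finite set of
# (surface form, analysis) pairs that can be looked up in either direction.
# The pairs and tag names are illustrative assumptions, not real HFST output.
PAIRS = [
    ("Spende", "Spende+Nom"),
    ("Spende", "Spende+Gen"),
    ("Beitrag", "Beitrag+Nom"),
    ("Beitrags", "Beitrag+Gen"),
]


def analyze(surface):
    """Surface form -> all analyses (the 'analysis' direction)."""
    return [analysis for form, analysis in PAIRS if form == surface]


def generate(analysis):
    """Analysis -> all surface forms (the 'generation' direction)."""
    return [form for form, candidate in PAIRS if candidate == analysis]


def replace_keeping_case(surface, new_lemma, wanted_tag):
    """Swap the lemma of `surface` for `new_lemma`, keeping the chosen tag.

    `wanted_tag` is needed because a surface form can be ambiguous:
    "Spende" is both nominative and genitive in this toy table.
    """
    results = []
    for analysis in analyze(surface):
        _lemma, tag = analysis.split("+", 1)
        if tag == wanted_tag:
            results.extend(generate(f"{new_lemma}+{tag}"))
    return results


print(replace_keeping_case("Spende", "Beitrag", "Gen"))  # ['Beitrags']
```

A real transducer encodes these pairs compactly as a finite-state network instead of a list, but the two lookup directions work the same way. Picking the right tag for an ambiguous form still needs context, for example from a tagger.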
rx
- A portable, modern regular expression language
I had a similar kind of idea for a long time, which I put into action a few weeks ago via a standalone transpiler of Emacs' rx macro to common regexp syntaxes.[0] I ended up getting interrupted and didn't completely finish it, but it generally works, though is probably riddled with edge cases.
The basic idea of rx is to use S-expressions to describe regular expressions, and my elevator pitch would've been to embed rx invocations in shell scripts using $(syntax), the main use case being something like sed invocations.
I still think it's a neat idea, and complex regular expressions tend to be hard to parse for humans.
[0]: https://github.com/sulami/rx
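To make the S-expression idea concrete, here is a minimal sketch of such a transpiler. It is not the linked tool's implementation: it handles only a handful of forms whose names are modelled on Emacs' rx (`seq`, `or`, `one-or-more`, `zero-or-more`, `optional`, `digit`, `bol`, `eol`), it targets Python's `re` syntax, and the function names are made up for the example.

```python
import re

# Tokens: parentheses, double-quoted strings (no escapes), or bare symbols.
TOKEN = re.compile(r'\(|\)|"[^"]*"|[^\s()"]+')

# A small assumed subset of rx-style names, mapped to Python `re` syntax.
SYMBOLS = {"digit": r"\d", "bol": "^", "eol": "$"}
SUFFIX = {"one-or-more": "+", "zero-or-more": "*", "optional": "?"}


def parse(tokens):
    """Consume tokens from the front of the list and return one parsed form."""
    tok = tokens.pop(0)
    if tok == "(":
        items = []
        while tokens[0] != ")":
            items.append(parse(tokens))
        tokens.pop(0)  # drop the closing parenthesis
        return items
    return tok


def emit(form):
    """Translate a parsed form into a regular expression string."""
    if isinstance(form, list):
        head, args = form[0], [emit(arg) for arg in form[1:]]
        if head == "seq":
            return "".join(args)
        if head == "or":
            return "(?:" + "|".join(args) + ")"
        if head in SUFFIX:
            return "(?:" + "".join(args) + ")" + SUFFIX[head]
        raise ValueError(f"unsupported form: {head}")
    if form.startswith('"'):
        return re.escape(form[1:-1])  # string literal, matched verbatim
    return SYMBOLS[form]  # bare symbol such as digit or bol


def rx(source):
    """Transpile one S-expression to a Python-flavoured regex."""
    return emit(parse(TOKEN.findall(source)))


pattern = rx('(seq bol (one-or-more digit) (or "px" "em") eol)')
print(pattern)  # ^(?:\d)+(?:px|em)$
```

The S-expression is longer than the regex it produces, but each part is named, which is the readability argument made above. Supporting other regex dialects, such as the one sed expects, would mean swapping the `SYMBOLS` and `SUFFIX` tables and the grouping syntax.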
What are some alternatives?
HFSM2 - High-Performance Hierarchical Finite State Machine Framework
logstash-patterns - Grok patterns for parsing and structuring log messages with logstash
lttoolbox - Finite state compiler, processor and helper tools used by apertium
fluent-plugin-grok-parser - Fluentd's Grok parser
apertium - Core tools (driver script, transfer, tagger, formatters) for the FOSS RBMT system Apertium
common-regex - Most common regex
simplenlg - Java API for Natural Language Generation. Originally developed by Ehud Reiter at the University of Aberdeen’s Department of Computing Science and co-founder of Arria NLG. This git repo is the official SimpleNLG version.
oil - Oils is our upgrade path from bash to a better language and runtime. It's also for Python and JavaScript users who avoid shell!
apertium-lex-tools - Module for compiling lexical selection rules and processing them in the pipeline.
ReadableRegex.jl - regexes for people who don't really want to learn or read regexes
kbnf - KBNF has been renamed to Dogma