| | hfst | common-regex |
|---|---|---|
| Mentions | 3 | 1 |
| Stars | 116 | 0 |
| Growth | 0.9% | - |
| Activity | 4.0 | 10.0 |
| Last commit | about 1 month ago | over 5 years ago |
| Language | C++ | Rust |
| License | GNU General Public License v3.0 only | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
hfst
- A portable, modern regular expression language
- Search-and-replace with correct grammatical case - does it exist?
  So you want to go from Spende to Spende+genitive to Beitrag+genitive to Beitrags. In addition to spaCy etc., you might look at finite-state transducers (FSTs), which I believe are bidirectional, i.e. usable for both analysis and generation. XFST, SFST, and OpenFST are a few of the FST toolkits. See https://github.com/hfst/hfst for the Helsinki FST; there's a German transducer for it at https://sourceforge.net/projects/hfst/files/resources/morphological-transducers/hfst-german-installable.tar.gz/download. I don't think there is much of a learning curve, and there should be plenty of documentation.
- Foldable Words
  The regex syntax is a bit quirky due to backwards compatibility with lexicons written in XFST, see https://github.com/hfst/hfst/wiki/Regular-Expression-Operato...
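The analysis-then-generation round trip described in the search-and-replace comment above (Spende to Spende+genitive to Beitrag+genitive to Beitrags) can be sketched with a toy lookup table. This is not the HFST API: `ToyTransducer`, `replace_lemma`, and the `lemma+tag` string format are hypothetical names chosen for illustration, and two hash maps stand in for what a real transducer encodes as a state machine. The two word pairs come from the comment itself.

```rust
use std::collections::HashMap;

/// Toy stand-in for a morphological transducer: a finite relation between
/// analyses ("lemma+tag") and surface forms that can be queried both ways.
/// A real FST encodes this relation as a state machine, not as hash maps.
struct ToyTransducer {
    to_surface: HashMap<String, String>,
    to_analyses: HashMap<String, Vec<String>>,
}

impl ToyTransducer {
    fn new(pairs: &[(&str, &str)]) -> Self {
        let mut to_surface = HashMap::new();
        let mut to_analyses: HashMap<String, Vec<String>> = HashMap::new();
        for (analysis, surface) in pairs {
            to_surface.insert(analysis.to_string(), surface.to_string());
            to_analyses
                .entry(surface.to_string())
                .or_default()
                .push(analysis.to_string());
        }
        ToyTransducer { to_surface, to_analyses }
    }

    /// Generation: analysis string -> surface form.
    fn generate(&self, analysis: &str) -> Option<&str> {
        self.to_surface.get(analysis).map(String::as_str)
    }

    /// Analysis: surface form -> every analysis that produces it.
    fn analyze(&self, surface: &str) -> &[String] {
        match self.to_analyses.get(surface) {
            Some(found) => found.as_slice(),
            None => &[],
        }
    }
}

/// Analyze `surface`, swap in `new_lemma`, and generate the matching forms.
fn replace_lemma(fst: &ToyTransducer, surface: &str, new_lemma: &str) -> Vec<String> {
    fst.analyze(surface)
        .iter()
        .filter_map(|analysis| analysis.split_once('+'))
        .filter_map(|(_, tag)| fst.generate(&format!("{}+{}", new_lemma, tag)))
        .map(str::to_string)
        .collect()
}

fn main() {
    // Hypothetical two-entry lexicon; a real German transducer covers far more.
    let fst = ToyTransducer::new(&[
        ("Spende+genitive", "Spende"),
        ("Beitrag+genitive", "Beitrags"),
    ]);
    println!("{:?}", replace_lemma(&fst, "Spende", "Beitrag"));
}
```

The sketch keeps only the property the comment relies on: the same relation answers both "which analyses produce this form?" and "which form does this analysis produce?". Ambiguity (one surface form with several analyses) is why `analyze` returns a list.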
common-regex
- A portable, modern regular expression language
  comparitor.insert("iyr", Regex::new(r"^(201[0-9]|2020)$").unwrap());
  There is a lot of number parsing.
  I would support both [[:1-12:]] and [[:01-12:]], for values without and with leading zeros respectively.
  About the variables:
  This file would look much more readable with variables that reuse other regexes:
  https://github.com/spcan/common-regex/blob/3238bc8ee85e0e000...
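The two suggestions in the comment above, numeric ranges with or without leading zeros and patterns assembled from named sub-patterns, can be sketched with plain string building. This is one possible expansion, not how common-regex or any proposed `[[:1-12:]]` syntax is actually implemented: `numeric_range` and the variable names are hypothetical, and the code only constructs pattern strings using the standard library, so no regex engine is required to run it.

```rust
/// Build a non-capturing alternation that matches every integer in `lo..=hi`.
/// With `zero_pad`, each number is left-padded with zeros to the width of `hi`.
/// (Hypothetical helper: one way a range such as [[:01-12:]] could expand.)
fn numeric_range(lo: u32, hi: u32, zero_pad: bool) -> String {
    let width = hi.to_string().len();
    let alternatives: Vec<String> = (lo..=hi)
        .map(|n| {
            if zero_pad {
                format!("{:0width$}", n, width = width)
            } else {
                n.to_string()
            }
        })
        .collect();
    format!("(?:{})", alternatives.join("|"))
}

fn main() {
    // Named fragments that later patterns reuse instead of repeating literals.
    let year = numeric_range(2010, 2020, false);
    let month = numeric_range(1, 12, true);

    // Accepts the same strings as ^(201[0-9]|2020)$ from the quoted line.
    let iyr = format!("^{}$", year);
    // A composed pattern built from both fragments (illustrative only).
    let year_month = format!("^{}-{}$", year, month);

    println!("{}", iyr);
    println!("{}", year_month);
}
```

The resulting strings can be passed to a regex engine unchanged. Keeping the `^` and `$` anchors matters for the unpadded form: in engines where the first matching alternative wins, an unanchored `1|2|...|12` would match only the leading `1` of `12`.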
What are some alternatives?
HFSM2 - High-Performance Hierarchical Finite State Machine Framework
fluent-plugin-grok-parser - Fluentd's Grok parser
lttoolbox - Finite state compiler, processor and helper tools used by apertium
rx - Standalone version of Emacs' rx macro
apertium - Core tools (driver script, transfer, tagger, formatters) for the FOSS RBMT system Apertium
kbnf - KBNF has been renamed to Dogma
simplenlg - Java API for Natural Language Generation. Originally developed by Ehud Reiter at the University of Aberdeen’s Department of Computing Science and co-founder of Arria NLG. This git repo is the official SimpleNLG version.
logstash-patterns - Grok patterns for parsing and structuring log messages with logstash
apertium-lex-tools - Module for compiling lexical selection rules and processing them in the pipeline.
oil - Oils is our upgrade path from bash to a better language and runtime. It's also for Python and JavaScript users who avoid shell!
JSVerbalExpressions - JavaScript Regular expressions made easy