| | hgrep-smallcore | byteseek |
|---|---|---|
| Mentions | 2 | 2 |
| Stars | 1 | 38 |
| Growth | - | - |
| Activity | 10.0 | 0.0 |
| Last commit | almost 2 years ago | almost 3 years ago |
| Language | Haskell | Java |
| License | BSD 3-clause "New" or "Revised" License | BSD 3-clause "New" or "Revised" License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
hgrep-smallcore
-
Rulex – A new, portable, regular expression language
Hold on! See: https://github.com/dan-blank/hgrep-smallcore/issues/1 Using `fromJust` is deeply frowned upon, and indeed a few eyebrows were raised when showing my code to the examiners. This and a handful of other warts are unfortunate, but they are few and easy to work around.
This function and others are in the prelude, but one can use other preludes that don't have these escape hatches.
This does not undermine the huge benefits that people get from having IO checked by the type system, just as having typecasting in a language does not undermine the benefits of types in that language.
byteseek
-
Rulex – A new, portable, regular expression language
Interesting. It's very similar to a regex language I created for byte-oriented regular expressions [0].
Similar usability principles: delimited strings, ignored whitespace, and comments.
[0] https://github.com/nishihatapalmer/byteseek/blob/master/synt...
-
Knuth-Morris-Pratt string-searching algorithm: DFA-less version
That was a fun read; I liked the use of CBMC to validate the algorithm.
For those who are interested, there's a good tool to specifically test string matching algorithms here:
https://github.com/smart-tool/smart
There are so many string matching algorithms now, with different best and worst cases. Some work better on small alphabets (e.g. DNA), some are better for text or high-entropy data, some take advantage of CPU instructions, and some are generic. The real challenge is picking the right algorithm.
I've implemented a few of them in Java here, and extended them to support multi-byte matching at any position:
https://github.com/nishihatapalmer/byteseek
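For readers unfamiliar with the algorithm named in the post title, here is a minimal sketch of Knuth-Morris-Pratt in its table-driven (DFA-less) form, written over byte arrays. The class and method names (`KmpSketch`, `failureTable`, `findAll`) are hypothetical and chosen for illustration; this is not byteseek's API, nor the code from the linked article.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative KMP sketch over byte arrays; names are hypothetical, not byteseek's API. */
final class KmpSketch {

    /**
     * Builds the failure table: fail[i] is the length of the longest proper
     * prefix of pattern[0..i] that is also a suffix of pattern[0..i].
     */
    static int[] failureTable(byte[] pattern) {
        int[] fail = new int[pattern.length];
        int k = 0; // length of the prefix matched so far
        for (int i = 1; i < pattern.length; i++) {
            // On a mismatch, fall back to the next shorter candidate prefix.
            while (k > 0 && pattern[i] != pattern[k]) {
                k = fail[k - 1];
            }
            if (pattern[i] == pattern[k]) {
                k++;
            }
            fail[i] = k;
        }
        return fail;
    }

    /** Returns the start offsets of all matches, including overlapping ones. */
    static List<Integer> findAll(byte[] text, byte[] pattern) {
        List<Integer> matches = new ArrayList<>();
        if (pattern.length == 0) {
            return matches; // assumption: an empty pattern yields no matches
        }
        int[] fail = failureTable(pattern);
        int k = 0; // number of pattern bytes currently matched
        for (int i = 0; i < text.length; i++) {
            // The text index never moves backwards; only k is reduced.
            while (k > 0 && text[i] != pattern[k]) {
                k = fail[k - 1];
            }
            if (text[i] == pattern[k]) {
                k++;
            }
            if (k == pattern.length) {
                matches.add(i - k + 1);
                k = fail[k - 1]; // keep going to find overlapping matches
            }
        }
        return matches;
    }
}
```

Calling `KmpSketch.findAll(text, pattern)` with two byte arrays returns the match offsets as a list. Because the text index only advances, the search runs in time linear in the combined length of text and pattern, which is the worst-case guarantee that distinguishes KMP from a naive scan.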
What are some alternatives?
pomsky - A new, portable, regular expression language
kleenexp - modern regular expression syntax everywhere with a painless upgrade path
melody - Melody is a language that compiles to regular expressions and aims to be more readable and maintainable
almson-regex - A simple library for writing readable regular expressions.
roman-arabic-calculator - This code is a proof of concept. The calculator can work with both Arabic (1,2,3,4,5 ...) and Roman (I, II, III, IV, V ...) numbers.
remake
oil - Oils is our upgrade path from bash to a better language and runtime. It's also for Python and JavaScript users who avoid shell!
thesis
regex - An implementation of regular expressions for Rust. This implementation uses finite automata and guarantees linear time matching on all inputs.