thesis
byteseek
| | thesis | byteseek |
|---|---|---|
| Mentions | 1 | 2 |
| Stars | 1 | 38 |
| Growth | - | - |
| Activity | 10.0 | 0.0 |
| Last commit | almost 4 years ago | almost 3 years ago |
| Language | TeX | Java |
| License | - | BSD 3-clause "New" or "Revised" License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
- Rulex – A new, portable, regular expression language
Interesting. It's very similar to a regex language I created for byte-oriented regular expressions [0].
Similar usability principles: delimited strings, ignored whitespace, and comments.
[0] https://github.com/nishihatapalmer/byteseek/blob/master/synt...
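To make two of those usability principles concrete, here is a minimal sketch using Java's standard `java.util.regex.Pattern` with the `COMMENTS` flag, which ignores whitespace and `#` comments inside the pattern. This is only an analogy: it is not byteseek's or Rulex's syntax, and the class name `ReadableRegexSketch` and the hex-byte pattern are made up for illustration.

```java
import java.util.regex.Pattern;

public class ReadableRegexSketch {
    // Pattern.COMMENTS lets the pattern contain layout whitespace and
    // '#' comments that run to the end of the line; both are ignored.
    // The pattern itself is a hypothetical example, not byteseek syntax.
    static final Pattern HEX_BYTE = Pattern.compile(
        "0x              # literal prefix\n" +
        "[0-9a-fA-F]{2}  # exactly two hex digits\n",
        Pattern.COMMENTS);

    // True only if the whole string is a two-digit hex byte such as 0x7f.
    static boolean isHexByte(String s) {
        return HEX_BYTE.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isHexByte("0x7f"));
        System.out.println(isHexByte("0x7"));
    }
}
```

The same expression written without the flag would be `0x[0-9a-fA-F]{2}`; the commented form is longer but documents each piece where it is used.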
- Knuth-Morris-Pratt string-searching algorithm: DFA-less version
That was a fun read, I liked the use of cmbc to validate the algorithm.
For those who are interested, there's a good tool to specifically test string matching algorithms here:
https://github.com/smart-tool/smart
There are so many string matching algorithms now, with different best and worst cases. Some work better on small alphabets (e.g. DNA), some are better for text or high-entropy data, some take advantage of CPU instructions, and some are generic. The real challenge is picking the right algorithm.
I've implemented a few of them in Java here, and extended them to support multi-byte matching at any position:
https://github.com/nishihatapalmer/byteseek
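As an illustration of why alphabet size matters, here is a minimal sketch of one well-known algorithm, Boyer-Moore-Horspool, over byte arrays. It is not taken from byteseek and does not use its API; the class name `HorspoolSketch` and the method `indexOf` are hypothetical. The skip distance depends on the last byte of the current window, so on a small alphabet such as DNA most bytes also occur in the pattern and the skips tend to be short, while on high-entropy data the window often jumps by the full pattern length.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class HorspoolSketch {
    // Returns the index of the first occurrence of pattern in text, or -1.
    static int indexOf(byte[] text, byte[] pattern) {
        int m = pattern.length;
        if (m == 0) return 0;

        // Shift table: how far the window may move, keyed by the byte
        // aligned with the last pattern position. Bytes that do not occur
        // in pattern[0..m-2] allow a full shift of m.
        int[] shift = new int[256];
        Arrays.fill(shift, m);
        for (int i = 0; i < m - 1; i++) {
            shift[pattern[i] & 0xFF] = m - 1 - i;
        }

        int pos = 0;
        while (pos <= text.length - m) {
            // Compare the window against the pattern from right to left.
            int j = m - 1;
            while (j >= 0 && text[pos + j] == pattern[j]) j--;
            if (j < 0) return pos;
            pos += shift[text[pos + m - 1] & 0xFF];
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] dna = "GATTACAGATTACA".getBytes(StandardCharsets.US_ASCII);
        byte[] needle = "TACA".getBytes(StandardCharsets.US_ASCII);
        System.out.println(indexOf(dna, needle)); // prints 3
    }
}
```

Knuth-Morris-Pratt makes the opposite trade-off: it never skips ahead of unread input, but it guarantees linear time in the worst case, which is one reason no single algorithm is the right pick for every input.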
What are some alternatives?
kleenexp - modern regular expression syntax everywhere with a painless upgrade path
hgrep-smallcore - University project: Haskell implementation of https://www.ccs.neu.edu/home/turon/re-deriv.pdf, with a very small internal regex representation.
almson-regex - A simple library for writing readable regular expressions.
regex - An implementation of regular expressions for Rust. This implementation uses finite automata and guarantees linear time matching on all inputs.
melody - Melody is a language that compiles to regular expressions and aims to be more readable and maintainable
roman-arabic-calculator - This code is a proof of concept. The calculator can work with both Arabic (1,2,3,4,5 ...) and Roman (I, II, III, IV, V ...) numbers.
oil - Oils is our upgrade path from bash to a better language and runtime. It's also for Python and JavaScript users who avoid shell!
remake
swift-evolution - This maintains proposals for changes and user-visible enhancements to the Swift Programming Language.