Rulex – A new, portable, regular expression language

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • pomsky

    A new, portable, regular expression language

  • https://rulex-rs.github.io/ - Very similar to legacy regex syntax; supports macros, number ranges, and Unicode; _amazing_ error messages help convert legacy syntax to the new syntax; backslash escapes only for quotes. Rust compiler; as of today there is no built-in way to use it outside Rust (but they seem to be planning it).

      ('What is your ' ('name'|'quest'|'favorite colour')'?' [s]){1,3}
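
    For comparison, here is a rough hand translation of the expression above into conventional regex syntax, tried out with Python's `re` module. This is a sketch, not output from the Rulex compiler: it assumes that `[s]` denotes the whitespace class (`\s`) and that parentheses act as non-capturing groups.

```python
import re

# Hand-written legacy equivalent (an assumption, not Rulex compiler output).
LEGACY = r"(?:What is your (?:name|quest|favorite colour)\?\s){1,3}"
pattern = re.compile(LEGACY)

# Two repetitions, each followed by one whitespace character.
text = "What is your name? What is your quest? "
matched = pattern.fullmatch(text) is not None

# An alternative that is not listed in the pattern should be rejected.
rejected = pattern.fullmatch("What is your age? ") is None
```

    Note how the literal `?` needs a backslash in the legacy form, while the quoted string in the Rulex version needs no escape.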

  • kleenexp

    modern regular expression syntax everywhere with a painless upgrade path

  • I've collected the different projects along with a nontrivial syntax example here: https://github.com/SonOfLilit/kleenexp#similar-works

    - Regular Expressions - very popular, occasionally reads like line noise, backslash for escape

  • melody

    Melody is a language that compiles to regular expressions and aims to be more readable and maintainable

  • [1-3 'What is your ' ['name' | 'quest' | 'favourite colour'] '?' [0-1 #space]]

    - https://github.com/yoav-lavi/melody - More verbose, supports macros, backslash escapes only for quotes. Rust compiler, babel plugin. Improves with time, getting quite impressive.

  • hgrep-smallcore

    University project: Haskell implementation of https://www.ccs.neu.edu/home/turon/re-deriv.pdf, with a very small internal regex representation.

  • Yes, and straightforwardly so if you use character classes as your basic building blocks. I wrote a Haskell implementation that is easily extendable to include complements: https://github.com/dan-blank/hgrep-smallcore (I like this project because it translates ERE-compliant regexes down to only 4 constructs, one of which is character classes). It implements https://www.ccs.neu.edu/home/turon/re-deriv.pdf; character classes are described in Section 4.2.

    I actually had complement in it as a 5th construct, but as the submission deadline approached and the examiners found some errors in my logic (my fault for not writing good enough unit tests!), I took complement out again.
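
    To illustrate why complement fits naturally into a derivative-based matcher, here is a minimal Python sketch of matching by regular-expression derivatives with character classes as the basic building block. The constructor names are illustrative and are not taken from hgrep-smallcore, and no simplification of intermediate terms is done, so it is only suitable for short inputs.

```python
from dataclasses import dataclass
from typing import FrozenSet


class Re:
    """Base class for regex terms."""


@dataclass(frozen=True)
class Chars(Re):
    chars: FrozenSet[str]  # the empty set matches nothing at all


@dataclass(frozen=True)
class Eps(Re):
    """Matches only the empty string."""


@dataclass(frozen=True)
class Cat(Re):
    left: Re
    right: Re


@dataclass(frozen=True)
class Alt(Re):
    left: Re
    right: Re


@dataclass(frozen=True)
class Star(Re):
    inner: Re


@dataclass(frozen=True)
class Not(Re):
    inner: Re  # complement: matches exactly what `inner` rejects


EMPTY = Chars(frozenset())
EPS = Eps()


def nullable(r: Re) -> bool:
    """Does r accept the empty string?"""
    if isinstance(r, Chars):
        return False
    if isinstance(r, (Eps, Star)):
        return True
    if isinstance(r, Cat):
        return nullable(r.left) and nullable(r.right)
    if isinstance(r, Alt):
        return nullable(r.left) or nullable(r.right)
    if isinstance(r, Not):
        return not nullable(r.inner)
    raise TypeError(r)


def deriv(r: Re, c: str) -> Re:
    """The regex matching what may follow after r has consumed c."""
    if isinstance(r, Chars):
        return EPS if c in r.chars else EMPTY
    if isinstance(r, Eps):
        return EMPTY
    if isinstance(r, Cat):
        first = Cat(deriv(r.left, c), r.right)
        return Alt(first, deriv(r.right, c)) if nullable(r.left) else first
    if isinstance(r, Alt):
        return Alt(deriv(r.left, c), deriv(r.right, c))
    if isinstance(r, Star):
        return Cat(deriv(r.inner, c), r)
    if isinstance(r, Not):
        return Not(deriv(r.inner, c))  # complement commutes with derivation
    raise TypeError(r)


def matches(r: Re, s: str) -> bool:
    for c in s:
        r = deriv(r, c)
    return nullable(r)


ab_star = Star(Cat(Chars(frozenset("a")), Chars(frozenset("b"))))  # (ab)*
not_ab_star = Not(ab_star)
```

    With these definitions, `matches(ab_star, "abab")` holds while `matches(not_ab_star, "abab")` does not; the complement case adds only one line to each of `nullable` and `deriv`.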

  • remake

  • Also notable is Remake, which has Rust bindings: https://github.com/ethanpailes/remake

  • thesis

  • byteseek

    A Java library for byte pattern matching and searching

  • Interesting. It's very similar to a regex language I created for byte-oriented regular expressions [0]

    Similar usability principles: delimited strings, ignore whitespace, and comments.

    [0] https://github.com/nishihatapalmer/byteseek/blob/master/synt...

  • swift-evolution

    This maintains proposals for changes and user-visible enhancements to the Swift Programming Language.

  • For simple regexes, Swift has short literals, and (AFAIK) you can mix and match the DSL and the short literals. https://github.com/apple/swift-evolution/blob/main/proposals... gives this example:

      // A regex for extracting a currency (dollars or pounds) and amount from input

  • RE2

    RE2 is a fast, safe, thread-friendly alternative to backtracking regular expression engines like those used in PCRE, Perl, and Python. It is a C++ library.

  • On a related note, if you have Python regex code that you want to make more stable/performant, https://pypi.org/project/pyre2/ is a drop-in replacement for `re` that (configurably) falls back to `re` if you use lookaheads, etc.

    The design philosophy behind RE2 for those unfamiliar with the library: https://github.com/google/re2/wiki/WhyRE2
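
    A common way to try such a drop-in replacement is an import with a fallback. The sketch below assumes the package installs a module named `re2` whose `compile` and `search` mirror the standard `re` module; if it is not installed, the code falls back to `re`. The pattern deliberately avoids lookaheads and backreferences, which RE2 does not support.

```python
try:
    import re2 as re  # assumed module name provided by the pyre2 package
except ImportError:
    import re  # standard-library fallback

# A simple pattern that both engines accept: two runs of digits joined by a dash.
RANGE = re.compile(r"(\d+)-(\d+)")

m = RANGE.search("pages 12-34 of the report")
low, high = int(m.group(1)), int(m.group(2))
```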

  • lit

    Lit is a simple library for building fast, lightweight web components.

  • > Are you honestly implying that there are still people who, in all seriousness, use a REGEX to parse HTML

    Subsets of it? Yes. See Google's lit-html as an example: https://github.com/lit/lit/blob/main/packages/lit-html/src/l...

  • RegExr

    RegExr is a HTML/JS based tool for creating, testing, and learning about Regular Expressions.

  • RegExr (https://regexr.com/) doesn't come up enough in these discussions. One of the nicest regex debugging/development tools on the internet today.

  • oil

    Oils is our upgrade path from bash to a better language and runtime. It's also for Python and JavaScript users who avoid shell!

  • I added this to the Alternative Regex Syntax wiki page with about a dozen similar projects:

    https://github.com/oilshell/oil/wiki/Alternative-Regex-Synta...

    e.g. compare with Melody 3 months ago: https://news.ycombinator.com/item?id=30358554

    and Oil's Eggex:

    https://www.oilshell.org/release/latest/doc/eggex.html

    From a quick glance Rulex looks very similar to Eggex!

    A difference is that Eggex is embedded in a shell so you can use normal assignment statements to build up subpatterns. And you can also interpolate directly into an 'egrep' or 'awk' command.

  • compose-regexp.js

    Build and compose maintainable regular expressions in JavaScript.

  • coffeescript

    Unfancy JavaScript

  • After looking at all the examples I can't say I'm a fan. Sometimes it's even more verbose than standard regular expressions. Over the years I've become quite familiar with regexp so maybe I'm just biased, but I'd rather have something like CoffeeScript's block expressions instead, where you can easily group and document each part:

    https://coffeescript.org/#regexes
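
    Python offers a comparable facility through the `re.VERBOSE` flag, which ignores whitespace in the pattern and allows `#` comments. The following is a small sketch of the "group and document each part" style the comment describes; the date pattern is an illustrative example, not one from the thread.

```python
import re

# Each part of the pattern sits on its own line with a comment.
DATE = re.compile(
    r"""
    (?P<year>\d{4})    # four-digit year
    -
    (?P<month>\d{2})   # two-digit month
    -
    (?P<day>\d{2})     # two-digit day
    """,
    re.VERBOSE,
)

m = DATE.fullmatch("2022-05-17")
parts = m.groupdict()
```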

  • regex

    An implementation of regular expressions for Rust. This implementation uses finite automata and guarantees linear time matching on all inputs.

  • We were talking about EREs, which are an artifact of POSIX, not UTS#18. So the relevant standard for this specific conversion is POSIX.

    To redirect to UTS#18, I don't think UTS#18 subsumes POSIX. UTS#18 doesn't support [[=a=]] for example AFAIK. And UTS#18 more generally doesn't require locale support. UTS#18 Level 3 was actually removed from the spec.

    I think UTS#18 is a tortured document, but yes, the regex crate supports pretty much all of UTS#18 Level 1: https://github.com/rust-lang/regex/blob/master/UNICODE.md

    Going beyond Level 1 is difficult.

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.
