A portable, modern regular expression language

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • hfst

    Helsinki Finite-State Technology (library and application suite)

  • common-regex

    Most common regex

  • comparitor.insert("iyr", Regex::new(r"^(201[0-9]|2020)$").unwrap());

    There is a lot of number parsing.

    I would support both [[:1-12:]] and [[:01-12:]] as options, without and with leading zeros respectively.

    About the variables:

    This file would be much more readable with variables that reuse other regexes:

    https://github.com/spcan/common-regex/blob/3238bc8ee85e0e000...
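
The two suggestions in this comment (numeric range classes and variables that reuse other regexes) can be sketched in plain Python. This is only an illustration: the names `MONTH`, `MONTH_PADDED`, `DAY_PADDED`, `YEAR` and `ISO_DATE` are hypothetical, and the expansions shown for `[[:1-12:]]` and `[[:01-12:]]` are an assumption about what such classes would mean, not the syntax of any existing library.

```python
import re

# Assumed hand-written expansions of the proposed range classes.
MONTH = r"(?:1[0-2]|[1-9])"                  # [[:1-12:]]  -> 1..12, no leading zero
MONTH_PADDED = r"(?:0[1-9]|1[0-2])"          # [[:01-12:]] -> 01..12, always two digits
DAY_PADDED = r"(?:0[1-9]|[12][0-9]|3[01])"   # 01..31, always two digits
YEAR = r"[0-9]{4}"

# A larger pattern reuses the named pieces instead of repeating them inline.
ISO_DATE = re.compile(rf"^{YEAR}-{MONTH_PADDED}-{DAY_PADDED}$")

def is_month(text: str, padded: bool = False) -> bool:
    """Check a month number, with or without a leading zero."""
    pattern = MONTH_PADDED if padded else MONTH
    return re.fullmatch(pattern, text) is not None
```

Keeping each range in one named constant means a fix to, say, the month range is made once and picked up by every pattern built from it.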

  • ReadableRegex.jl

    regexes for people who don't really want to learn or read regexes

  • I find something like this a lot more readable:

    https://github.com/jkrumbiegel/ReadableRegex.jl

    It is in Julia, but if you have it installed locally it’s just a few taps away. You can even generate the regex, use it in Python, and keep the ReadableRegex source in a comment nearby.

  • rx

    Standalone version of Emacs' rx macro (by sulami)

  • I had a similar idea for a long time, which I put into action a few weeks ago via a standalone transpiler of Emacs' rx macro to common regexp syntaxes.[0] I ended up getting interrupted and didn't completely finish it, but it generally works, though it is probably riddled with edge cases.

    The basic idea of rx is to use S-expressions to describe regular expressions, and my elevator pitch would've been to embed rx invocations in shell scripts using $(syntax), the main use case being something like sed invocations.

    I still think it's a neat idea, since complex regular expressions tend to be hard for humans to parse.

    [0]: https://github.com/sulami/rx

  • logstash-patterns

    Grok patterns for parsing and structuring log messages with logstash

  • Why don't languages have grok patterns in their standard libraries?

    They seem to exist only in log-parsing ecosystems, but they really help get rid of little bugs and of wrong parsing caused by hand-written regex patterns.

    Instead of doing "^\d+(\.\d+){3}$" for IP checking, which is clearly wrong, you'd do "%{IPV4:ip}", which is so much better.

    List of known patterns: https://github.com/hpcugent/logstash-patterns/blob/master/fi...

    Even for PHP, a third-party library only has 15 stars.
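
To make the point about the naive IP regex concrete, here is a minimal Python sketch. It is not logstash's implementation: `PATTERNS`, `OCTET`, `expand` and `GROK_IPV4` are hypothetical names, and the table holds a single hand-written IPv4 pattern, whereas real grok libraries ship far larger pattern sets.

```python
import re

# The naive pattern quoted above: any runs of digits separated by three dots.
NAIVE_IPV4 = re.compile(r"^\d+(\.\d+){3}$")

# Hypothetical pattern table with one entry; each octet is limited to 0..255.
OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])"
PATTERNS = {"IPV4": rf"{OCTET}(?:\.{OCTET}){{3}}"}

def expand(template: str) -> str:
    """Replace each %{NAME:field} with a named group built from PATTERNS."""
    def substitute(match: re.Match) -> str:
        name, field = match.group(1), match.group(2)
        return f"(?P<{field}>{PATTERNS[name]})"
    return re.sub(r"%\{(\w+):(\w+)\}", substitute, template)

GROK_IPV4 = re.compile("^" + expand("%{IPV4:ip}") + "$")

def extract_ip(text: str):
    """Return the matched address, or None if the text is not a valid IPv4."""
    match = GROK_IPV4.match(text)
    return match.group("ip") if match else None
```

With these definitions the naive pattern accepts `999.999.999.999`, while the named pattern rejects it and still extracts `192.168.0.1` into the `ip` field.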

  • JSVerbalExpressions

    JavaScript Regular expressions made easy

  • I agree with you. I got so tired of fighting with regex that I reached the point of simply not using it if at all possible.

    A comment further up offered a very promising alternative.

    https://github.com/VerbalExpressions/JSVerbalExpressions#tes...

    It's a bit verbose, but I don't care anymore. I am too much of a veteran to care about my code being sleek; I want it readable and workable.

  • kbnf

    KBNF has been renamed to Dogma

  • The fundamental problem comes from assigning meaning to whitespace (in this case, concatenation). I had the same issues when developing KBNF ( https://github.com/kstenerud/kbnf/blob/master/kbnf.md ) which operates in a closely related space.

    In early development, I took a number of cues from regex that turned out to be bad ideas, in particular using whitespace for concatenation (which all BNF dialects seem to do).

    Switching to '&' for concatenation fixed it and made things a lot clearer, as it would also do for Pomsky:

        'Hello' & ' '+ & ('world' | 'pomsky')
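
The effect of an explicit concatenation operator can be imitated in Python with operator overloading. This is a rough sketch only, neither KBNF nor Pomsky syntax: the `Pat` class and the `lit` helper are hypothetical, and the one-or-more operator is written as a `plus()` method because Python has no postfix `+`.

```python
import re

class Pat:
    """A regex fragment that is combined only through explicit operators."""

    def __init__(self, source: str):
        self.source = source

    def __and__(self, other: "Pat") -> "Pat":
        # '&' is concatenation; whitespace in the expression carries no meaning.
        return Pat(self.source + other.source)

    def __or__(self, other: "Pat") -> "Pat":
        # '|' is alternation, always wrapped in a non-capturing group.
        return Pat(f"(?:{self.source}|{other.source})")

    def plus(self) -> "Pat":
        # One or more repetitions of this fragment.
        return Pat(f"(?:{self.source})+")

def lit(text: str) -> Pat:
    """Build a fragment that matches the given text literally."""
    return Pat(re.escape(text))

# Mirrors: 'Hello' & ' '+ & ('world' | 'pomsky')
greeting = lit("Hello") & lit(" ").plus() & (lit("world") | lit("pomsky"))
```

Because alternation and repetition always produce a grouped fragment, joining fragments with `&` cannot accidentally change what an adjacent `|` or `+` applies to.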

  • fluent-plugin-grok-parser

    Fluentd's Grok parser

  • It may have originated in one project, but many other log parsers, such as Vector and fluentd, support it too.

    https://vector.dev/docs/reference/vrl/examples/#parse_grok

    https://github.com/fluent/fluent-plugin-grok-parser

  • oil

    Oils is our upgrade path from bash to a better language and runtime. It's also for Python and JavaScript users who avoid shell!

  • FWIW here is a list of other such projects: https://github.com/oilshell/oil/wiki/Alternative-Regex-Synta...

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.
