The Awk State Machine Parser Pattern (2018)

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • awk-libs

    GNU awk libraries

  • md2html.awk

  • I used a similar technique for my awk markdown parser: https://github.com/yiyus/md2html.awk

    An awk state machine is a straightforward way to deal with data like this log file. It is less clear that it is the best way to write a relatively large piece of software, like a markdown parser. When I wrote md2html.awk in 2009, the standard md parser was the original one by John Gruber, written in Perl, so it actually was an improvement in code clarity, performance, and portability (we had no Perl in Plan 9!). Nowadays it is easy to find much better solutions.

  • tbl2bed

    Convert a feature table into a bed file

  • Glad I’m not the only one crazy enough to write long AWK parsers. Here’s my tool to convert “feature tables” into “bed” files, both files that describe genomes, but for some reason, NCBI uses the former, even though it’s totally useless. https://github.com/ryandward/tbl2bed/blob/main/tbl2bed.awk

  • diff2html

    A script that uses awk and bash to quickly* produce HTML output of the diff between two files (by berry-thawson)

  • https://github.com/berry-thawson/diff2html

    This is my first attempt at writing an awk script. I would like to know how readable it is.

  • Onigmo

    Onigmo is a regular expressions library forked from Oniguruma.

  • >Ruby even supports Perl regular expressions

    No, Ruby Regexp is based on the https://github.com/k-takata/Onigmo library. There are plenty of differences compared to Perl, for example `^` and `$` anchors always match start/end of lines without needing a flag, subexpression syntax uses `\g` instead of `(?N)` and so on.

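The state-machine technique that the md2html.awk comment above describes for log-like data can be sketched roughly as follows. This is only an illustration under stated assumptions: the log format (`BEGIN <name>`, `ERROR ...`, `END`), the state names `idle` and `in_block`, and the variables `name` and `errors` are all hypothetical and are not taken from the original post or from any of the listed projects.

```shell
#!/bin/sh
# Hypothetical log format, invented purely for illustration.
log='BEGIN backup
copy a.txt
ERROR disk full
ERROR retry failed
END
noise outside any block
BEGIN cleanup
rm tmp
END'

result=$(printf '%s\n' "$log" | awk '
    # Two states: "idle" (outside a block) and "in_block" (inside one).
    BEGIN { state = "idle" }

    # idle to in_block: remember the block name and reset the counter.
    state == "idle" && $1 == "BEGIN" { state = "in_block"; name = $2; errors = 0; next }

    # Stay in in_block: count error lines.
    state == "in_block" && $1 == "ERROR" { errors++; next }

    # in_block to idle: report the block and leave it.
    state == "in_block" && $1 == "END" { print name ": " errors " error(s)"; state = "idle"; next }

    # Any other line matches no rule and is ignored in both states.
')

printf '%s\n' "$result"
```

Each rule pairs a state test with a line pattern, so the program reads like a transition table. That works well at this size; as the comment above suggests, the approach may become harder to follow as the number of states grows.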