We Need to Know LR and Recursive Descent Parsing Techniques

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • diagnose

    A simple library for reporting compiler/interpreter errors

  • TameParse

    LALR parser with context-sensitive extensions

  • I wrote a parser generator quite a long time ago that I think improves the syntax quite a lot, and which has an interesting approach to generalisation: you can write conditions on the lookahead (which are just grammars that need to be matched in order to pick a given rule when a conflict needs to be resolved). This construct makes it much easier to write a grammar that matches how a language is designed.

    Here's an ANSI-C parser, for example: https://github.com/Logicalshift/TameParse/blob/master/Exampl... - this is an interesting example because `(foo)(bar)` is fully ambiguous in ANSI C: it can be a cast or a function call depending on whether `foo` is a type or a variable.

    The new construct makes it possible to extend grammars and disambiguate them - here's a C99 grammar that extends the ANSI-C grammar: https://github.com/Logicalshift/TameParse/blob/master/Exampl....

    It also allows matching at least some context-sensitive languages - see https://github.com/Logicalshift/TameParse/blob/master/Exampl...

    An advantage over GLR or backtracking approaches is that this still detects ambiguities in the language, so it's much easier to write a grammar that doesn't end up running in exponential time or space. Also, when an ambiguity is resolved by the generalisation, the grammar specifies which interpretation is chosen, rather than the choice being arbitrary (backtracking) or left until later (GLR).

    I was working on improving error handling when I stopped work on this, but my approach here was not working out.

    (This is a long-abandoned project of mine but the approach to ambiguities and the syntax seem like they're novel to me and were definitely an improvement over anything else I found at the time. The lexer language has a few neat features in it too)
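The `(foo)(bar)` ambiguity described above can be illustrated with a small Python sketch. This is not TameParse's grammar syntax or implementation; `classify` and `type_names` are hypothetical names chosen for illustration. It only shows the underlying idea: the same token sequence is read as a cast or a call depending on a condition checked at the point of conflict, here whether the name is a known type.

```python
def classify(tokens, type_names):
    """Classify tokens shaped like ( IDENT ) ( ... ) as a cast or a call.

    `type_names` is an assumed set of identifiers currently known to be types
    (for example, names introduced by typedef).
    """
    looks_ambiguous = (
        len(tokens) >= 4
        and tokens[0] == "("
        and tokens[1].isidentifier()
        and tokens[2] == ")"
        and tokens[3] == "("
    )
    if not looks_ambiguous:
        return "other"
    # The guard: identical tokens, different parse, decided by the name's kind.
    return "cast" if tokens[1] in type_names else "call"


tokens = ["(", "foo", ")", "(", "bar", ")"]
print(classify(tokens, {"foo"}))  # cast
print(classify(tokens, set()))    # call
```

The point of the sketch is that the choice is written down explicitly as a condition, so it is neither an accident of rule ordering nor deferred to a later pass.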

  • scroll

    Tools for thought. A language for bloggers. This repo contains the language and a static site generator command line app.

  • > Context-free grammars, and their associated parsing techniques, don't align well with real-world compilers, and thus we should deemphasise CFGs (Context-Free Grammars) and their associated parsing algorithms.

    I think CFGs are highly overrated. Top-down recursive descent parsers are simple and allow you to craft more human languages. I think building top-down parsers is something every dev should do. It's a simple technique with tremendous power.

    I think the source code for Scroll (https://github.com/breck7/scroll/tree/main/grammar) demonstrates how liberating moving away from CFGs can be. Easy to extend, compose, build new backends, debug, et cetera. Parser, compiler, and interpreter for each node all in one place. Swap nodes around between languages. Great evolutionary characteristics.

    I'll stop there (realizing I need to improve the docs and write a blog post).
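For readers who have not written one, here is a minimal sketch of a top-down recursive descent parser in Python. It is not taken from Scroll; the `Parser` class, the `parse` helper, and the tiny arithmetic language (integers, `+`, `-`, `*`, parentheses) are assumptions made purely for illustration. Each grammar rule becomes one method, which is what makes the technique easy to extend and debug.

```python
import re


class Parser:
    def __init__(self, text):
        # Numbers become one token; any other non-space character is its own token.
        self.tokens = re.findall(r"\d+|\S", text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        # expr := term (('+' | '-') term)*
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.take()
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        # term := factor ('*' factor)*
        value = self.factor()
        while self.peek() == "*":
            self.take()
            value = value * self.factor()
        return value

    def factor(self):
        # factor := NUMBER | '(' expr ')'
        tok = self.take()
        if tok is not None and tok.isdecimal():
            return int(tok)
        if tok == "(":
            value = self.expr()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return value
        raise SyntaxError(f"unexpected token: {tok!r}")


def parse(text):
    """Parse and evaluate `text`, rejecting any leftover input."""
    parser = Parser(text)
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"trailing input: {parser.peek()!r}")
    return value


print(parse("1 + 2 * 3"))    # 7
print(parse("(1 + 2) * 3"))  # 9
```

Operator precedence falls out of which method calls which: `expr` calls `term`, and `term` calls `factor`, so `*` binds tighter than `+` without any precedence table.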

  • syntacs

    Syntacs Translation Toolkit

  • As a self-taught programmer with no formal education, I figured writing a lexer/parser toolkit would be a good way to bootstrap my CS knowledge (about 20 years ago). I went down the rabbit hole for months on it, really glad I did.

    I can totally relate to the author's interest in error correction of LR parsers, it's a fascinating topic that I also was nerd-sniped by at the time: https://github.com/inxar/syntacs/blob/6461578a04d3b0fda5af20...

  • oil

    Oils is our upgrade path from bash to a better language and runtime. It's also for Python and JavaScript users who avoid shell!

  • I agree with 90%-95% of the article. I would phrase it more pithily as:

    1. Use recursive descent for recognizing existing languages, where you have a good test corpus. (And very good point about testing INVALID inputs too!)

    2. Use grammars to design new languages

    I follow this with https://www.oilshell.org/ -- the compatible OSH is a huge recursive descent parser, and Oil is designed with a grammar.

    Guy Steele and Russ Cox also mentioned this methodology -- design the language with a grammar, and then implement it a second time by hand. Then you REALLY know it! :)

    It is definitely possible to not know the language you're implementing / creating!

    ---

    As an undergrad, I never tried to design a language, or even thought about it, honestly. But I agree that undergrads should know of both approaches, without necessarily getting into the details of CFGs.

    It would be a mistake to just teach recursive descent.
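The advice above about testing invalid inputs as well as valid ones can be sketched as follows. This is an illustrative Python example, not code from Oils; the `recognize` function, the toy language (nested lists of integers), and the `VALID` and `INVALID` corpora are all assumptions made for the sketch. A hand-written recognizer is only as trustworthy as the corpus it is checked against, and the rejected cases matter as much as the accepted ones.

```python
def recognize(text):
    """Return True if `text` is an integer or a nested list such as [1, [2, 3]]."""
    pos = 0

    def skip_ws():
        nonlocal pos
        while pos < len(text) and text[pos].isspace():
            pos += 1

    def value():
        nonlocal pos
        skip_ws()
        if pos < len(text) and text[pos] == "[":
            pos += 1
            skip_ws()
            if pos < len(text) and text[pos] == "]":  # empty list
                pos += 1
                return True
            while True:
                if not value():
                    return False
                skip_ws()
                if pos < len(text) and text[pos] == ",":
                    pos += 1
                    continue
                if pos < len(text) and text[pos] == "]":
                    pos += 1
                    return True
                return False
        start = pos
        while pos < len(text) and text[pos] in "0123456789":
            pos += 1
        return pos > start  # at least one digit is required

    ok = value()
    skip_ws()
    return ok and pos == len(text)  # reject trailing input


# Hypothetical test corpus: inputs that must be accepted and inputs that must be rejected.
VALID = ["42", "[]", "[1, [2, 3]]", " [ 1 ,2 ] "]
INVALID = ["", "[", "[1,]", "[1 2]", "1]", "[[1]", "[,1]"]

for sample in VALID:
    assert recognize(sample), f"should accept {sample!r}"
for sample in INVALID:
    assert not recognize(sample), f"should reject {sample!r}"
print("corpus ok")
```

Without the `INVALID` list, a recognizer that accepted every input would pass the same test suite.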

  • antlr4ts

    Optimized TypeScript target for ANTLR 4 (by dberlin)

  • I implemented it for ANTLR (which is LL(*), so not fully recursive descent, but top-down) but never got around to shepherding the patches through.

    https://github.com/dberlin/antlr4ts/blob/incremental/Increme...

    and the commits on that branch should give you some idea.

    I'm happy to chat about it elsewhere as well, drop me a line at my email.

  • gcc

    Docker Official Image packaging for gcc (by docker-library)

  • You're misinformed. clang also has a custom parser. Both gcc and clang have great support for new standards, though as those standards are evolving some features aren't there yet (but none of the missing features have anything to do with parsing difficulties). You can find lots of detail on feature coverage at https://gcc.gnu.org .
