| | parser-demo | re2c |
|---|---|---|
| Mentions | 13 | 12 |
| Stars | 18 | 1,020 |
| Growth | - | - |
| Activity | 0.0 | 6.8 |
| Last commit | almost 3 years ago | 7 days ago |
| Language | Lex | C |
| License | - | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
parser-demo
-
Flex scanner memory leak help
I have a demo project that fixes a lot of the default insanities.
-
Advice for a first-time designer of my own original programming language? Presently writing the interpreter!
I have an old demo project that demonstrates most of that for flex/bison, but this is C/C++ oriented so might not be super applicable for you.
-
Should there be many hardcoded enums for terminals, non terminals, DFA, productions or a txt/csv file processing for compiler initialization? Is it bad to use global variables in the C code for compiler?
It's just unfortunate that flex and bison have bad defaults due to historical compatibility with lex and yacc. If you're using them I have a demo project that tweaks the defaults toward sanity, though I haven't updated it for recent warnings.
-
Doubt on building a tree using LEX and YACC
I made a demo project that ties them together. I'm pretty sure at least one of the fixes was due to header cycle problems, but I haven't touched it for a while. (note also that the demo was fully warning-clean at the time, but there may be new warnings since)
-
Simple question on compiler and syntax rules
I made a demo project that avoids several historical annoyances with Flex and Bison - in particular, I made it warnings-clean (though I haven't updated it recently), and I avoid global state.
-
Getting Lex + Yacc to recognize keywords
I have a demo project that enables all of the non-default options that you really should be using in all new projects. Note that the demo doesn't demonstrate meaningful grammars, just shows how to arrange the surrounding code and makefile.
-
Declaring yylex() and yyerror() in 2022
I have a demo project that twiddles most of the important knobs.
-
How to create an AST from bash in c?
Bash is a really complicated language. Redirections and such are simple; they are just tokenization, and any lexer/parser tutorial should get you there (I have a demo project using flex/bison).
-
what would you use to write a parser in 2021?
Bison can be configured to avoid every single one of those problems. I use reentrant versions (both with and without push parsing) in my demo project. I admit I didn't bother with named references or cleaning up after errors.
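To make the "reentrant" and "push parsing" ideas concrete, here is a minimal hand-written sketch in C. It does not use Bison or its generated API; every name in it (`sum_parser`, `sum_parser_push`, the token kinds) is hypothetical and chosen only for illustration. It shows the two properties the comment refers to: all state lives in a caller-owned struct rather than in globals, and the caller pushes tokens into the parser one at a time instead of the parser pulling them from a lexer.

```c
#include <stdbool.h>

/* Hypothetical token kinds for this sketch (not Bison's). */
enum token_kind { TOK_NUM, TOK_PLUS, TOK_END };

/* All parser state is kept here instead of in global variables,
 * so several parsers can be active at the same time. */
struct sum_parser {
    long total;       /* running sum of the numbers seen so far */
    bool expect_num;  /* true when the next token must be a number */
    bool failed;      /* set once a syntax error has been seen */
};

void sum_parser_init(struct sum_parser *p)
{
    p->total = 0;
    p->expect_num = true;
    p->failed = false;
}

/* Push one token into the parser for the grammar NUM ('+' NUM)* END.
 * Returns true while the input seen so far is still valid. */
bool sum_parser_push(struct sum_parser *p, enum token_kind kind, long value)
{
    if (p->failed)
        return false;

    switch (kind) {
    case TOK_NUM:
        if (!p->expect_num) { p->failed = true; break; }
        p->total += value;
        p->expect_num = false;
        break;
    case TOK_PLUS:
        if (p->expect_num) { p->failed = true; break; }
        p->expect_num = true;
        break;
    case TOK_END:
        if (p->expect_num)
            p->failed = true;  /* input ended where a number was required */
        break;
    }
    return !p->failed;
}
```

Because the caller drives the loop, tokens can come from any source (a socket, an interactive prompt, another parser), and two `sum_parser` values can be fed interleaved input without interfering with each other.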
-
Practical parsing with Flex and Bison
A lot of the things that people hate about flex/bison are actually just their defaults for compatibility with lex/yacc, which can be changed. I have a demo project that does some of the things: https://github.com/o11c/parser-demo
re2c
-
Ask HN: What are some unpopular technologies you wish people knew more about?
(1) Zulip Chat - https://zulip.com/ - seems to be reasonably popular, but more people should know about it
I’ve been using it for over 5 years now [1], and it’s as good as ever. It’s way faster than any other chat app I’ve used. It has a good UI and conversation model. It has a simple and functional API that lets me curl threads and write blog posts based on them.
(only problem is that I Ctrl-+ in my browser to make the font bigger – I think it’s too dense for most people)
(2) re2c regex to state machine compiler - https://re2c.org
A gem from the 90’s, which people have done a great job maintaining and improving (getting Go and Rust target support in the last few years). I started using it in 2016, and used it for a new program a few months ago. I came to the conclusion that it should have been built into C, because C has shitty string processing – and Ken Thompson both invented C AND brought regular languages to computing!!
In comparison, tree-sitter lexers are very low level, fiddly, and error prone. I recently saw dozens of ad hoc fixes to the tree-sitter-bash lexer, which is unsurprising if you look at the structure of the code (manually crawling through backslashes and braces in C).
https://github.com/tree-sitter/tree-sitter-bash/blob/master/...
These fixes are definitely appreciated, but I think it indicates a problem with the model itself.
(based on https://lobste.rs/s/endspx/software_you_are_thankful_for#c_y...)
[1] https://www.oilshell.org/blog/2018/04/26.html
-
Irregular Expressions
The "Papers" section on re2c's web site continues Laurikari's work: http://re2c.org/
... but I haven't found them particularly accessible. And it's not clear it's a viable strategy in a general purpose regex engine. Namely, I'm not sure how much bigger it makes the DFA.
Also, AFAIK, these are DFAs. They are different theoretical structures with explicitly more power.
> and then an NDFA is used to match a third time, to extract the capture groups.
That's the PikeVM. It's an NFA simulation. Although it uses additional storage and is otherwise more computationally powerful than just a plain NFA.
-
My experience crafting an interpreter with Rust (2021)
> What do you gain by using it?
Performance, although this possibly depends on your compiler, whether you use PGO, and similar finicky issues.
Example: https://eli.thegreenplace.net/2012/07/12/computed-goto-for-e...
Some prior HN discussion: https://news.ycombinator.com/item?id=18678920
Another example where goto is relevant is implementing finite automata. A (very short) paper from 1988 that discusses three different ways of implementing a finite state machine is "How (Not) to Code a Finite State Machine". The documentation of RE2C may be even more interesting: https://re2c.org
RE2C is a program that compiles finite automata into C, Go, or Rust code. It provides many implementation strategies: it can make use of computed or labelled gotos when the language provides them.
Implementing pushdown automata comes with similar issues.
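As a rough illustration of the goto style of coding a finite automaton mentioned above, here is a small hand-written sketch in C. It is not re2c output and is not taken from the cited paper; the function name `is_integer` and the language it accepts (an optional sign followed by one or more decimal digits) are assumptions made for the example. Each state is a label and each transition is a `goto`, so the current state is encoded in the program counter rather than in a variable.

```c
#include <stdbool.h>

/* Hand-written illustration (not generated by re2c): a DFA that
 * accepts an optional '+' or '-' followed by one or more digits. */
bool is_integer(const char *s)
{
    /* Start state: consume an optional sign. */
    if (*s == '+' || *s == '-')
        s++;
    goto need_digit;

need_digit:   /* Non-accepting state: at least one digit is required. */
    if (*s >= '0' && *s <= '9') {
        s++;
        goto more_digits;
    }
    return false;

more_digits:  /* Accepting state: loop over any remaining digits. */
    if (*s >= '0' && *s <= '9') {
        s++;
        goto more_digits;
    }
    return *s == '\0';  /* accept only if the whole input was consumed */
}
```

The alternative designs keep the state in a variable and dispatch through a `switch` or a transition table on every character; the goto form avoids that dispatch step, which is where the possible performance gain comes from.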
-
How to compile DPDK-22.11.1
wget https://github.com/skvadrik/re2c/releases/download/1.0.3/re2c-1.0.3.tar.gz
tar -zxvf re2c-1.0.3.tar.gz
cd re2c-1.0.3/
./configure
make
make install
-
Best approach for writing a lexer
In Rust I use https://docs.rs/logos/latest/logos/. I think another similar one is http://re2c.org
- re2c is a free and open-source lexer generator for C/C++, Go and Rust
-
File parsing with PHP, Bison and re2c
re2c is an open-source lexer generator. It uses regular expressions to recognize tokens.
-
Best option for Rust Parser and Lexer Generators?
Those suggested crates are still more or less the popular options. There was also recently added support for Rust in re2c.
-
How Does One Develop the Grammar for their New Language
-
Javascript Date String Parsing
First, the implementation of strtotime is a textbook study in why other people's C code is not where you want to spend time. You can see the guts of the implementation logic here. This isn't stock C code -- it's code for a system called re2c. This system allows you to write regular expressions in a custom DSL (domain specific language), and then transform/compile those regular expressions down to C programs (also C++ and Go) that will execute those regular expressions. Something in PHP's makefile uses this parse_date.re file to generate parse_date.c. If you don't realize parse_date.c is a generated file, this can be extremely rough going. If you're not familiar with re2c, it can be regular rough going. We leave further exploration as an exercise for the reader -- an exercise we haven't taken ourselves.
What are some alternatives?
lexy - C++ parsing DSL
Luxon - ⏱ A library for working with dates and times in JS
arcsecond - ✨Zero Dependency Parser Combinator Library for JS Based on Haskell's Parsec
cmark - CommonMark parsing and rendering library and program in C
PEGTL - Parsing Expression Grammar Template Library
lowdown - simple markdown translator
oil - Oils is our upgrade path from bash to a better language and runtime. It's also for Python and JavaScript users who avoid shell!
moment - Parse, validate, manipulate, and display dates in javascript.
pcomb - parser combinators in PostScript and C
plex - a parser and lexer generator as a Rust procedural macro
mal - Make a Lisp
dperf - dperf is a 100Gbps network load tester.