re2c
CyberChef
re2c | CyberChef | |
---|---|---|
12 | 286 | |
1,026 | 25,733 | |
- | 2.8% | |
6.8 | 9.3 | |
18 days ago | 4 days ago | |
C | JavaScript | |
GNU General Public License v3.0 or later | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
re2c
-
Ask HN: What are some unpopular technologies you wish people knew more about?
(1) Zulip Chat - https://zulip.com/ - seems to be reasonably popular, but more people should know about it
I’ve been using it for over 5 years now [1], and it’s as good as ever. It’s way faster than any other chat app I’ve used. It has a good UI and conversation model. It has a simple and functional API that lets me curl threads and write blog posts based on them.
(only problem is that I Ctrl-+ in my browser to make the font bigger – I think it’s too dense for most people)
(2) re2c regex to state machine compiler - https://re2c.org
A gem from the 90’s, which people have done a great job maintaining and improving (getting Go and Rust target support in the last few years). I started using it in 2016, and used for a new program a few months ago. I came to the conclusion that it should have been built into C, because C has shitty string processing – and Ken Thompson both invented C AND brought regular languages to computing !!
In comparison, treesitter lexers are very low level, fiddly, and error prone. I recently saw dozens of ad hoc fixes to the tree-sitter-bash lexer, which is unsurprising if you look at the structure of the code (manually crawling through backslashes and braces in C).
https://github.com/tree-sitter/tree-sitter-bash/blob/master/...
These fixes are definitely appreciated, but I think it indicates a problem with the model itself.
(based on https://lobste.rs/s/endspx/software_you_are_thankful_for#c_y...)
[1] https://www.oilshell.org/blog/2018/04/26.html
-
Irregular Expressions
The "Papers" section on re2c's web site continues Laurikari's work: http://re2c.org/
... but I haven't found them particularly accessible. And it's not clear it's a viable strategy in a general purpose regex engine. Namely, I'm not sure how much bigger it makes the DFA.
Also, AFAIK, these are DFAs. They are different theoretical structures with explicitly more power.
> and then an NDFA is used to match a third time, to extract the capture groups.
That's the PikeVM. It's an NFA simulation. Although it uses additional storage and is otherwise more computationally powerful than just a plain NFA.
-
My experience crafting an interpreter with Rust (2021)
> What do you gain by using it?
Performance, although this possibly depends on your compiler, whether you use PGO, and similar finicky issues.
Example: https://eli.thegreenplace.net/2012/07/12/computed-goto-for-e...
Some prior HN discussion: https://news.ycombinator.com/item?id=18678920
Another example where goto is relevant is implementing finite automata. A (very short) paper from 1988 that discusses three different ways of implementing a finite state machine is "How (Not) to Code a Finite State Machine". The documentation of RE2C may be even more interesting: https://re2c.org
RE2C is a program that compiles finite automata into C, Go, or Rust code. It provides many implementation strategies: it can make use of computed or labelled gotos when the language provides them.
Implementing pushdown automata comes with similar issues.
-
How to compile DPDK-22.11.1
wget https://github.com/skvadrik/re2c/releases/download/1.0.3/re2c-1.0.3.tar.gz tar -zxvf re2c-1.0.3.tar.gz cd re2c-1.0.3/ ./configure make make install
-
Best approach for writing a lexer
In Rust I use https://docs.rs/logos/latest/logos/. I think another similar is http://re2c.org
- re2c is a free and open-source lexer generator for C/C++, Go and Rust
-
File parsing with PHP, Bison and re2c
re2c is an open-source lexer generator. It uses regular expressions to recognize tokens.
-
Best option for Rust Parser and Lexer Generators?
Those suggested crates are still more or less the popular options. There was also recently added support for Rust in re2c.
- How Does One Develop the Grammar for their New Language
-
Javascript Date String Parsing
First, the implementation of strtotime is a textbook study in why other people's C code is not where you want to spend time. You can see the guts of the implementation logic here. This isn't stock C code -- it's code for a system called re2c. This system allows you to write regular expressions in a custom DSL (domain specific language), and then transform/compile those regular expressions down to C programs (also C++ and Go) that will execute those regular expressions. Something in PHP's make file uses this parse_date.re file to generate parse_date.c. If you don't realize parse_date.c is a generated file, this can be extremely rough going. If you've not familiar with re2c is can be regular rough going. We leave further exploration as an exercise for the reader -- an exercise we haven't taken ourself.
CyberChef
-
PicoCTF 2024: packer
Then we take the encrypted text and use CyberChef to decrypt it.
-
Unbreakable 2024: secrets-of-winter
Let's go to CyberChef and insert our pieces of evidence.
-
YouTube: Google has found a way to break Invidious
A parameter was changed from '2AMBCgIQBg' to 'CgIIAdgDAQ%3D%3D' which is just the correct base64 encoding they should have been using the entire time.
I don't think this was a hostile action by Google, I think someone just added better input validation for security reasons and it accidently broke the bad requests they were sending.
https://gchq.github.io/CyberChef/#recipe=URL_Decode()From_Ba...
-
PicoCTF 2024- CanYouSee
❗This is indeed the flag, but the text is encrypted with Base64. Usually, the presence of padding character "=" indicates that's Base64 type of encoding (but that's only one of the hints). To decrypt it, we can use CyberChef. Copy-paste the text and we either:
-
CyberChef VS DevToolboxWeb - a user suggested alternative
2 projects | 6 Feb 2024
-
CyberChef from GCHQ: The Cyber Swiss Army Knife
It uses a combination of magic bytes (like the `file` command), entropy analysis and character frequency detection to determine whether an output is likely to be of interest to the user.
The file type mechanism is written here[0]. There's a list of all signatures we detect here[1].
[0] https://github.com/gchq/CyberChef/blob/master/src/core/lib/F...
- Show HN: File Hider
- UK GCHQ's CyberChef
-
Lets try this again. Got a code for you to break.
I think this can be deciphered using CyberChef...
- CyberChef is a useful tool for decoding information.
What are some alternatives?
parser-demo - Good source layout with Flex and Bison
QR-Code-generator - High-quality QR Code generator library in Java, TypeScript/JavaScript, Python, Rust, C++, C.
Luxon - ⏱ A library for working with dates and times in JS
CapRover - Scalable PaaS (automated Docker+nginx) - aka Heroku on Steroids
cmark - CommonMark parsing and rendering library and program in C
py4e - Web site for www.py4e.com and source to the Python 3.0 textbook
lowdown - simple markdown translator
cyberchef-recipes - A list of cyber-chef recipes and curated links
moment - Parse, validate, manipulate, and display dates in javascript.
Ciphey - ⚡ Automatically decrypt encryptions without knowing the key or cipher, decode encodings, and crack hashes ⚡
plex - a parser and lexer generator as a Rust procedural macro
Monica - Personal CRM. Remember everything about your friends, family and business relationships.