| | re2c | datasette |
|---|---|---|
| Mentions | 12 | 187 |
| Stars | 1,026 | 8,963 |
| Growth | - | - |
| Activity | 6.8 | 9.3 |
| Last commit | 18 days ago | 5 days ago |
| Language | C | Python |
| License | GNU General Public License v3.0 or later | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
re2c
-
Ask HN: What are some unpopular technologies you wish people knew more about?
(1) Zulip Chat - https://zulip.com/ - seems to be reasonably popular, but more people should know about it
I’ve been using it for over 5 years now [1], and it’s as good as ever. It’s way faster than any other chat app I’ve used. It has a good UI and conversation model. It has a simple and functional API that lets me curl threads and write blog posts based on them.
(only problem is that I Ctrl-+ in my browser to make the font bigger – I think it’s too dense for most people)
(2) re2c regex to state machine compiler - https://re2c.org
A gem from the 90’s, which people have done a great job maintaining and improving (getting Go and Rust target support in the last few years). I started using it in 2016, and used it for a new program a few months ago. I came to the conclusion that it should have been built into C, because C has shitty string processing – and Ken Thompson both invented C AND brought regular languages to computing!
In comparison, tree-sitter lexers are very low level, fiddly, and error prone. I recently saw dozens of ad hoc fixes to the tree-sitter-bash lexer, which is unsurprising if you look at the structure of the code (manually crawling through backslashes and braces in C).
https://github.com/tree-sitter/tree-sitter-bash/blob/master/...
These fixes are definitely appreciated, but I think it indicates a problem with the model itself.
(based on https://lobste.rs/s/endspx/software_you_are_thankful_for#c_y...)
[1] https://www.oilshell.org/blog/2018/04/26.html
-
Irregular Expressions
The "Papers" section on re2c's web site continues Laurikari's work: http://re2c.org/
... but I haven't found them particularly accessible. And it's not clear it's a viable strategy in a general purpose regex engine. Namely, I'm not sure how much bigger it makes the DFA.
Also, AFAIK, these are tagged DFAs. They are different theoretical structures with explicitly more power.
> and then an NDFA is used to match a third time, to extract the capture groups.
That's the PikeVM. It's an NFA simulation. Although it uses additional storage and is otherwise more computationally powerful than just a plain NFA.
-
My experience crafting an interpreter with Rust (2021)
> What do you gain by using it?
Performance, although this possibly depends on your compiler, whether you use PGO, and similar finicky issues.
Example: https://eli.thegreenplace.net/2012/07/12/computed-goto-for-e...
Some prior HN discussion: https://news.ycombinator.com/item?id=18678920
Another example where goto is relevant is implementing finite automata. A (very short) paper from 1988 that discusses three different ways of implementing a finite state machine is "How (Not) to Code a Finite State Machine". The documentation of RE2C may be even more interesting: https://re2c.org
RE2C is a program that compiles finite automata into C, Go, or Rust code. It provides many implementation strategies: it can make use of computed or labelled gotos when the language provides them.
Implementing pushdown automata comes with similar issues.
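To make the finite-automaton discussion above concrete, here is a minimal table-driven sketch in Python. It is not re2c output and does not use gotos (Python has none); the states, input classes, and the `longest_match` helper are all hypothetical names chosen for illustration. It shows only the general idea of stepping a DFA over input and remembering the last accepting position.

```python
# Hypothetical hand-written DFA for two token kinds (illustrative, not re2c output).

def char_class(ch):
    """Map a character to a coarse input class."""
    if ch.isdigit():
        return "digit"
    if ch.isalpha() or ch == "_":
        return "alpha"
    return "other"

# Transition table: (state, input class) -> next state.
TRANSITIONS = {
    ("start", "digit"): "int",
    ("start", "alpha"): "ident",
    ("int", "digit"): "int",
    ("ident", "alpha"): "ident",
    ("ident", "digit"): "ident",
}

# Accepting states and the token kind each one reports.
ACCEPTING = {"int": "INT", "ident": "IDENT"}

def longest_match(text, pos=0):
    """Run the DFA from pos; return (kind, end) for the longest match, or None."""
    state = "start"
    last_accept = None
    i = pos
    while i < len(text):
        state = TRANSITIONS.get((state, char_class(text[i])))
        if state is None:  # no transition: the automaton is stuck
            break
        i += 1
        if state in ACCEPTING:
            last_accept = (ACCEPTING[state], i)
    return last_accept
```

A generated lexer would typically encode the same transitions directly in control flow (a switch, or gotos where the target language has them) rather than looking them up in a dictionary, which is where the performance discussion above comes in.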
-
How to compile DPDK-22.11.1
wget https://github.com/skvadrik/re2c/releases/download/1.0.3/re2c-1.0.3.tar.gz
tar -zxvf re2c-1.0.3.tar.gz
cd re2c-1.0.3/
./configure
make
make install
-
Best approach for writing a lexer
In Rust I use https://docs.rs/logos/latest/logos/. I think another similar tool is http://re2c.org
- re2c is a free and open-source lexer generator for C/C++, Go and Rust
-
File parsing with PHP, Bison and re2c
re2c is an open-source lexer generator. It uses regular expressions to recognize tokens.
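As a rough illustration of recognizing tokens with regular expressions, the sketch below uses Python's standard `re` module at run time. This is not how re2c works internally (re2c compiles its rules to source code ahead of time), and the token names and patterns in `TOKEN_SPEC` are assumptions made up for the example.

```python
import re

# Hypothetical token set; names and patterns are illustrative only.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=]"),
    ("SKIP", r"\s+"),
]

# One combined pattern with a named group per token kind.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(text):
    """Yield (kind, lexeme) pairs; raise ValueError on an unrecognized character."""
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        pos = match.end()
        if match.lastgroup != "SKIP":  # whitespace is matched but not reported
            yield match.lastgroup, match.group()
```

For instance, `list(tokenize("x = 42"))` yields an `IDENT`, an `OP`, and a `NUMBER` token. Note that Python's alternation takes the first alternative that matches, so the order of entries in `TOKEN_SPEC` matters.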
-
Best option for Rust Parser and Lexer Generators?
Those suggested crates are still more or less the popular options. There was also recently added support for Rust in re2c.
- How Does One Develop the Grammar for their New Language
-
Javascript Date String Parsing
First, the implementation of strtotime is a textbook study in why other people's C code is not where you want to spend time. You can see the guts of the implementation logic here. This isn't stock C code -- it's code for a system called re2c. This system allows you to write regular expressions in a custom DSL (domain-specific language), and then transform/compile those regular expressions down to C programs (also C++ and Go) that will execute those regular expressions. Something in PHP's makefile uses this parse_date.re file to generate parse_date.c. If you don't realize parse_date.c is a generated file, this can be extremely rough going. If you're not familiar with re2c, it can be regular rough going. We leave further exploration as an exercise for the reader -- an exercise we haven't taken ourselves.
datasette
-
Ask HN: High quality Python scripts or small libraries to learn from
Simon Willison's GitHub would be a great place to get started imo -
https://github.com/simonw/datasette
- Show HN: TextQuery – Query and Visualize Your CSV Data in Minutes
-
Little Data: How do we query personal data? (2013)
I'm a fan of simonw's datasette/dogsheep ecosystem https://datasette.io/
-
LaTeX and Neovim for technical note-taking
I use Anki the exact same way. After a lifetime of learning I have accepted that I will never read over anything I write for myself voluntarily - so my two options are:
1. Write an article so good I can publish it and look it over myself later on. I did this last year with https://andrew-quinn.me/fzf/, for example.
2. Create Anki cards out of the material. Use the builtin Card Browser or even https://datasette.io/ on the underlying SQLite database in a pinch to search for my notes any time I have to.
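Searching notes stored in a SQLite file, as option 2 describes, could look roughly like the sketch below. The `notes` table with `id` and `text` columns is a hypothetical stand-in, not a claim about Anki's actual schema, and the data is loaded into an in-memory database so the example is self-contained.

```python
import sqlite3

def search_notes(conn, term):
    """Return (id, text) rows whose text contains term (LIKE ignores ASCII case)."""
    pattern = f"%{term}%"
    cursor = conn.execute(
        "SELECT id, text FROM notes WHERE text LIKE ? ORDER BY id", (pattern,)
    )
    return cursor.fetchall()

# Stand-in data; a real file would be opened with sqlite3.connect(path) instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, text TEXT)")
conn.executemany(
    "INSERT INTO notes (text) VALUES (?)",
    [
        ("fzf fuzzy finder basics",),
        ("LaTeX align environment",),
        ("Neovim and FZF keybindings",),
    ],
)
```

Calling `search_notes(conn, "fzf")` returns the first and third rows here. A `%` or `_` inside the search term acts as a LIKE wildcard, which is usually acceptable for ad hoc personal searches.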
-
Daily Price Tracking for Trader Joes
Were you aware of, or tempted by https://datasette.io/ for creating your solution?
- SQLite-Web: Web-based SQLite database browser written in Python
-
Ask HN: What two software products should have a kid?
Browsing HN, GitHub and the like we get to see a huge variety of software products and code bases.
I often see products and think - if this product X, got together with Y, it would be pretty cool - kind of like if they had a kid together.
Not too literally, but more on the conceptual level - my level of programming is low.
E.g. Just some....
- pocketbase.io & datasette (+ some more charting) [https://pocketbase.io, https://datasette.io]
-
Ask HN: Looking for a project to volunteer on? (February 2024)
You might like the Datasette project: https://datasette.io/
I don't think they are desperate for contributions but it's a welcoming environment and a fun project to hack on. You'll learn a lot just from reading the source and the incredibly informative PRs. The creator is a really talented developer with a great blog which shows up on the HN front page often.
-
Stuff I Learned during Hanukkah of Data 2023
Last year I worked through the challenges using VisiData, Datasette, and Pandas. I walked through my thought process and solutions in a series of posts.
-
What We Watched: A Netflix Engagement Report – About Netflix
> uploads of boring raw excel data and receive a nice UI
https://datasette.io/
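One way to get raw tabular data into a form that a tool like Datasette can browse is to load it into SQLite first. The sketch below does that with Python's standard library only. The `load_csv` helper and the sample table and column names are hypothetical, and it assumes trusted input, since header names are used directly as column identifiers.

```python
import csv
import io
import sqlite3

def load_csv(conn, table, csv_text):
    """Create `table` from the CSV header row and insert the remaining rows as text."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', data)
    conn.commit()
    return len(data)  # number of data rows inserted
```

If the connection points at a file such as `data.db` rather than an in-memory database, that file could then be opened with Datasette (for example `datasette data.db`) to get a browsable web UI over the table.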
What are some alternatives?
parser-demo - Good source layout with Flex and Bison
nocodb - 🔥 🔥 🔥 Open Source Airtable Alternative
Luxon - ⏱ A library for working with dates and times in JS
duckdb - DuckDB is an in-process SQL OLAP Database Management System
cmark - CommonMark parsing and rendering library and program in C
sql.js-httpvfs - Hosting read-only SQLite databases on static file hosters like Github Pages
lowdown - simple markdown translator
litestream - Streaming replication for SQLite.
moment - Parse, validate, manipulate, and display dates in javascript.
Sequel-Ace - MySQL/MariaDB database management for macOS
plex - a parser and lexer generator as a Rust procedural macro
beekeeper-studio - Modern and easy to use SQL client for MySQL, Postgres, SQLite, SQL Server, and more. Linux, MacOS, and Windows.