csvquote
awk
csvquote
- csvquote – smart and simple CSV processing on the command line
- Understanding Awk
There is a small program I wrote called csvquote[1] that can be used to sanitize input to awk so it can rely on delimiter characters (commas) to always mean delimiters. The results from awk then get piped through the same program at the end to restore the commas inside the field values.
Also works for other text processing tools like cut, sed, sort, etc.
[1] https://github.com/dbro/csvquote
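The sanitize-then-restore idea described above can be sketched in a few lines of Python. This is only an illustration of the technique, not csvquote's actual code: the function names `sanitize`, `restore`, and `second_column` are hypothetical, and the substitute characters are an assumption chosen because they rarely appear in real data.

```python
# Assumption: these stand-in characters are illustrative; any characters that
# never occur in the data would serve the same purpose.
FIELD_SUB = "\x1f"   # stands in for a comma inside a quoted field
RECORD_SUB = "\x1e"  # stands in for a newline inside a quoted field


def sanitize(text: str) -> str:
    """Replace commas and newlines that occur inside double-quoted fields."""
    out = []
    in_quotes = False
    for ch in text:
        if ch == '"':
            # An escaped quote ("") toggles twice, so the state is unchanged.
            in_quotes = not in_quotes
            out.append(ch)
        elif in_quotes and ch == ",":
            out.append(FIELD_SUB)
        elif in_quotes and ch == "\n":
            out.append(RECORD_SUB)
        else:
            out.append(ch)
    return "".join(out)


def restore(text: str) -> str:
    """Put the original commas and newlines back."""
    return text.replace(FIELD_SUB, ",").replace(RECORD_SUB, "\n")


def second_column(text: str) -> list:
    """Naive delimiter split, the way awk -F, or cut -d, would see the data."""
    return [line.split(",")[1] for line in text.split("\n") if line]


data = 'id,name,city\n1,"Smith, John",Boston\n2,"Multi\nline",Paris\n'
naive = second_column(data)
safe = [restore(field) for field in second_column(sanitize(data))]
```

Splitting the raw data breaks both quoted fields apart, while splitting the sanitized data and restoring afterwards keeps each field whole. The naive splitter here plays the role of awk, cut, or sort in the middle of the pipeline.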
- Awk: The Power and Promise of a 40-Year-Old Language
CSVs with quoted fields and embedded newlines can be troublesome in awk. Years ago I found a script that worked for me; I'm not sure, but I think it was this:
http://lorance.freeshell.org/csv/
There's also https://github.com/dbro/csvquote, which is more Unix-like in philosophy: it only handles transforming the CSV data into something that awk (or other utilities) can more easily deal with. I haven't used it, but I will probably try it next time I need something like that.
awk
- Awk: Power and Promise of a 40 yr old language (2021)
Yep, functions! I used to write a fair amount of Awk code back in the late '80s and early '90s. I treated Awk as a "real" programming language and tried to make the code nice and readable. This of course involved a lot of use of functions.
I only have a couple of surviving examples of the code from back then, but here they are for the curious:
https://github.com/geary/awk
LJPII.AWK is probably the best example. It made a nicely formatted printout of source code on my HP LaserJet II printer. I wish I had one of the printouts it generated, but they are long gone.
Hmm... I wonder if my Brother printer supports the old LaserJet II control codes? Or maybe there is an emulator online?
The code was written for Thompson Awk (TAWK), so some bits would need to be adapted to modern Awks.
- Understanding Awk
I used to love Awk! I still do, even if I don't use it much any more.
Awk has a reputation for being hard to read (as noted in stevebmark's comment), but when I was using it actively, I tried to treat it as a serious programming language and write readable programs in it.
Several years ago I tracked down a couple of my old Awk programs from around 1990 and posted them here:
https://github.com/geary/awk
SHANEY.AWK is an implementation of the infamous Mark V. Shaney:
https://www.clear.rice.edu/comp200/09fall/textriff/sci_am_pa...
This was probably the first program that made me really impressed with Awk. People were writing rather complicated Shaney implementations in C, and I thought, "this could be really simple in Awk." And it was!
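For readers who have not met Mark V. Shaney: the core idea is a word-level Markov chain in which each pair of consecutive words predicts the next word. A minimal Python sketch of that idea follows. It is not a translation of SHANEY.AWK; the names `build_table` and `generate` are hypothetical, and starting from the first two words of the input is a simplifying assumption.

```python
import random
from collections import defaultdict


def build_table(words):
    """Map each pair of consecutive words to the words seen right after it."""
    table = defaultdict(list)
    for a, b, c in zip(words, words[1:], words[2:]):
        table[(a, b)].append(c)
    return table


def generate(words, length, rng):
    """Emit up to `length` words by sampling a follower of the latest pair."""
    table = build_table(words)
    a, b = words[0], words[1]  # assumption: start from the opening pair
    out = [a, b]
    while len(out) < length:
        followers = table.get((a, b))
        if not followers:  # dead end: this pair only occurs at the very end
            break
        a, b = b, rng.choice(followers)
        out.append(b)
    return out


source = "the cat sat on the mat and the cat ran to the mat and the dog sat".split()
result = generate(source, 20, random.Random(0))
print(" ".join(result))
```

Storing repeated followers in a list, rather than in a set, means that words seen more often after a pair are proportionally more likely to be chosen.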
LJPII.AWK is the Awk program I'm most proud of. This was in the days when we had tiny screens and no multiple monitors, and you always printed out your code to read it. In my circles we were also fond of inserting "separator lines" between functions, in various formats such as this one:
// - - - - - - - - - - - - - - - - - -
What are some alternatives?
csvinfo - A small util to show max column lengths for a passed CSV file.
busybox-w32 - WIN32 native port of BusyBox.
Awk-Batteries - Public AWK Directory
cligen - Nim library to infer/generate command-line-interfaces / option / argument parsing; Docs at
postgres - Docker Official Image packaging for Postgres
bioawk - BWK awk modified for biological data
frawk - an efficient awk-like language
mkmcsv - Command-line utility for processing CSV files exported from Cardmarket.
bashbrew - Canonical build tool for the official images