goawk
qsv
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
goawk
- GoAWK, an Awk interpreter written in Go (2018)
-
The Awk Programming Language, Second Edition
TIL: GoAWK [1] - A POSIX-compliant AWK interpreter written in Go, with CSV support.
[1]: https://github.com/benhoyt/goawk
- Looking for a script for csv file
-
Anyone else doing compiler work in Golang?
Another nice project that I have used from time to time (and a very good source for insight) is the awk interpreter written in go https://github.com/benhoyt/goawk
-
Tool to interact with CSV
No, I want exactly the opposite - it should be a , b,c as a single string field containing a literal comma, and c. For example, https://github.com/benhoyt/goawk has csv support. https://github.com/benhoyt/goawk/blob/master/docs/csv.md - more info.
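The behaviour asked for above (a quoted field that keeps its literal comma) can be sketched with Go's standard `encoding/csv` package. This is not GoAWK's own CSV code, and the sample line `"a , b",c` is an assumption about what the poster's input looked like; `parseLine` is a hypothetical helper name.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// parseLine is a hypothetical helper: it parses one CSV record using
// RFC 4180-style quoting, so a comma inside double quotes stays in the field.
func parseLine(line string) ([]string, error) {
	r := csv.NewReader(strings.NewReader(line))
	return r.Read()
}

func main() {
	// Assumed sample input: a quoted field containing a comma, then a plain field.
	fields, err := parseLine(`"a , b",c`)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("%d fields: %q\n", len(fields), fields) // 2 fields: ["a , b" "c"]
}
```

A plain split on commas would give three pieces here, which is why a CSV-aware parser is needed.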
-
Why does awk parse '1&&x=1' as '1&&(x=1)' not '(1&&x)=1' when '&&' is higher precedence than '='?
I've had a go at solving this in this PR -- feedback welcome. I don't love it, but oh well, it solves the problem at hand. Your comment pointed me in the right direction, thanks again.
-
Looking for programming languages created with Go
There are quite a few re-implementations of scripting languages like Lua in Go. I've written an AWK interpreter in Go.
-
Oracle DB support in Benthos
github.com/benhoyt/goawk -> this library lets you embed an AWK runtime in your applications, very easy to use and useful for enabling some powerful scripting in things you build
-
Brian Kernighan adds Unicode support to Awk (May, 2022)
Yes, that's right. With my simplistic UTF-8-based implementation it turned length() -- for example -- from O(1) to O(N), turning O(N) algorithms which use length() into O(N^2). See this issue: https://github.com/benhoyt/goawk/issues/93
Similar with substr() and other string functions, which when operating as bytes are O(1), but become O(N) when trying to count the number of codepoints as UTF-8.
GNU Gawk has a fancier approach, which stores strings as UTF-8 as long as it can, but converts to UTF-32 if it needs to (eg: the string is non-ASCII and you call substr).
It looks like Brian Kernighan's code has the same issue with length() and substr(). I'm going to try to email him about this, as I think it's kind of a performance blocker.
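The cost difference described above can be illustrated in Go. This is a minimal sketch and not GoAWK's actual implementation; `byteLen` and `charLen` are hypothetical names introduced for the example.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteLen returns the length in bytes. Go stores the length alongside
// the string data, so this is O(1).
func byteLen(s string) int {
	return len(s)
}

// charLen counts Unicode code points. It has to scan every byte of the
// UTF-8 encoding, so it is O(N) in the length of the string.
func charLen(s string) int {
	return utf8.RuneCountInString(s)
}

func main() {
	s := "héllo" // 'é' takes two bytes in UTF-8
	fmt.Println(byteLen(s), charLen(s)) // 6 5
}
```

Calling something like `charLen` once per iteration over an N-byte string is what turns an otherwise O(N) loop into O(N^2).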
-
Ask HN: Is having a Personal blog/brand worth it for you?
I'm not sure if it was via my personal website or just my GitHub profile, but I got my current job at Canonical due to the CTO there reaching out about my GoAWK project (https://github.com/benhoyt/goawk). I get regular recruitment emails because I have my CV/resume online: most of them are very low-effort, but 1 in 20 or something are interesting emails where the recruiter has actually looked at my website and will tailor it personally. I also just enjoy technical writing, and get joy out of sharing it on HN. So it's "worth it" for me.
qsv
- FLaNK Weekly 31 December 2023
-
Qsv: Efficient CSV CLI Toolkit
Thanks for the detailed feedback @snidane!
As maintainer of qsv, here's my reply:
- Given qsv's rapid release cycle (173 releases over three years), the auto-update check is essential at the moment. Once we reach 1.0, I'll turn it off. For now, given your feedback, I've only made it check 10% of the time.
- Pivot is in the backlog and I'll be sure to add unpivot when I implement it. (https://github.com/jqnatividad/qsv/issues/799)
- I'll add a dedicated summing command with the group by (-by) and window by (-over) capability (https://github.com/jqnatividad/qsv/issues/1514). Do note that `stats` has basic sum as @ezequiel-garzon pointed out.
- With the `enum` command, qsv can achieve what you proposed with `laminate`. E.g. `qsv enum --new-column newcol --constant newconstant mydata.csv --output laminated-data.csv`
- With the `cat rowskey` command, qsv can already concatenate files with mismatched headers.
- Other file formats: qsv supports parquet, csv, tsv, excel, ods, datapackage, sqlite and more (see https://github.com/jqnatividad/qsv/tree/master#file-formats). Fixed-format, though, is not supported yet; it is quite interesting, and I have added it to the backlog (https://github.com/jqnatividad/qsv/issues/1515).
- As to "enable embedding outputs of commands", qsv is composable by design, so you can use standard stdin/stdout redirection/piping techniques to have it work with other CLI tools like jq, awk, etc.
Finally, just released v0.120.0 that already incorporates the less aggressive self-update check. https://github.com/jqnatividad/qsv/releases/tag/0.120.0
- Joining CSV Data Without SQL: An IP Geolocation Use Case
-
Why my favourite API is a zipfile on the European Central Bank's website
qsv [1] also has a sqlp command which lets you run Polars SQL queries (even on multiple files). Here I'll send the csv data from stdin (represented by -) and then (optionally) pipe the output to the table command for formatting. The shape of the result is also printed to stderr (the (4, 2) below).
[1] https://github.com/jqnatividad/qsv
$ echo 'Name,Department,Salary
- Qsv: Ultra-fast CSV data-wrangling toolkit
- Qsv: CSVs sliced, diced and analyzed (fork of xsv)
- Nushell.sh ls | where size > 10mb | sort-by modified
-
Do Rust and Lua work well together?
It works quite well IMHO. Using the mlua crate, I’ve managed to integrate Luau as a very powerful data-wrangling DSL for qsv (https://github.com/jqnatividad/qsv)
-
How to manipulate this CSV in Python?
Maybe this might be better done using this? https://github.com/jqnatividad/qsv
-
How to convert xlsx to csv using Rust?
https://github.com/jqnatividad/qsv is another option.
What are some alternatives?
bytehound - A memory profiler for Linux.
miller - Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON
tsv-utils - eBay's TSV Utilities: Command line tools for large, tabular data files. Filtering, statistics, sampling, joins and more.
calamine - A pure Rust Excel/OpenDocument SpreadSheets file reader: rust on metal sheets
awka - Revive awka - Awk to C Compiler
fortune-sheet - A drop-in javascript spreadsheet library that provides rich features like Excel and Google Sheets
intellij-awk - The missing IntelliJ IDEA language support plugin for AWK
tumblelog - A static tumblelog generator available as both a Perl and Python version
csvquote
awk - One true awk
ess-view-data - View data support for ESS