| | datastation | miller |
|---|---|---|
| Mentions | 25 | 63 |
| Stars | 2,854 | 8,571 |
| Growth | 0.1% | - |
| Activity | 0.0 | 9.0 |
| Last commit | 6 months ago | 7 days ago |
| Language | TypeScript | Go |
| License | GNU General Public License v3.0 or later | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
datastation
-
Code coverage for Go integration tests
There was already a technique where you could use `go test -cover` with the `-c` and `-o` flags to produce a binary from `go test` rather than actually running tests. So you could build a binary that had coverage enabled. Then when you ran
Here's an example: https://github.com/multiprocessio/datastation/blob/main/runn....
I can't remember where I found this technique but it's been around for a while.
This new option is the same thing but a way to `go build` with `-cover` instead of `go test -cover -c -o $out`? Do I have that right?
-
Engineers using dbt with VS Code - how are you previewing your results in lieu of the functionality provided by dbt cloud?
If my employer doesn't consider paying for dbt cloud, I will use u/eatonphil's datastation, run the queries on a dev database, then put them in dbt.
- Show HN: DataStation – App to easily query, script, and visualize data
-
Windmill.dev
I build a somewhat similar app, DataStation [0], that is in JavaScript and Go. It supports scripting in Python, Julia, R, JavaScript, Ruby, etc.
The server version of it exists and I run it myself but that process is not documented yet. (Most people use it as a desktop app today.)
[0] https://github.com/multiprocessio/datastation
-
Datasette Lite: a server-side Python web application running in a browser
My biggest issue with Pyodide is the long wait times. I haven't figured out a way around a ~5 second load time where the entire UI hangs every single time you load the page.
My app (similar to Simon's, a lite mode of a data IDE): https://app.datastation.multiprocess.io.
My code: https://github.com/multiprocessio/datastation/blob/main/shar....
-
Lies we tell ourselves to keep using Golang
I use Go heavily cross-platform developing DataStation [0] and dsq [1]. I am not an expert. And I don't have proof for it but on some rudimentary benchmarks the Linux-specific file idioms in the Go standard library definitely don't seem to translate well to even macOS let alone Windows. For example some good streaming techniques for reading large files on Linux that work really well there seemed to be pretty bad on macOS.
I think Amos has presented more proof than I can on the topic of just how Linux-influenced Go is. And I think it is fine for the majority of Go users because most of them are building server apps or Linux CLIs.
Amos has spent some time building cross-platform desktop systems with Go for itch.io and I think I'm seeing some of the same things they are in that scenario.
I think this is a reasonable article. If Amos gets flame-y at any point I think it's worth ignoring because there does seem to be something up with Go in cross-platform applications.
I like Go a lot and for most things I'd keep using it still. Just sharing some observations.
[0] https://github.com/multiprocessio/datastation
[1] https://github.com/multiprocessio/dsq
-
Feeling overwhelmed when trying to contribute to opensource projects
I keep a page of good first projects for two big projects I work on. The only expectation is that you know Go. I've had a couple of people who've never contributed to OSS come in and get some meaningful features merged.
-
Ask HN: Who wants to collaborate? (April 2022)
I've got some good first projects if you're interested in OSS data tools and have some Go experience.
Check out: https://github.com/multiprocessio/datastation/blob/main/GOOD...
-
Open source Go projects to contribute (beginners)
Some example projects: DataStation (desktop GUI for querying every kind of database, scripting and graphing the results), dsq (a CLI companion for running SQL queries on many kinds of files), and go-json (a library for fast JSON encoding of arrays of large objects).
-
Ask HN: Anyone making a living building desktop applications?
I'm building a desktop-first (SaaS-eventual) data IDE for developers [0]. Making a living? Not yet.
It being desktop-first makes it as easy to try out in a corporate environment as Sublime. The data never leaves your machine. Desktop-first is a big deal in devtools for this reason.
[0] https://github.com/multiprocessio/datastation
miller
- Qsv: Efficient CSV CLI Toolkit
-
jq 1.7 Released
jq and miller[1] are essential parts of my toolbelt, right up there with awk and vim.
[1]: https://github.com/johnkerl/miller
-
Perl first commit: a “replacement” for Awk and sed
> This works really well if your problem can be solved in one or two liners.
My personal comfort threshold is around the 100-line mark. It's even possible to write maintainable shell scripts up to 500 lines, but it mostly depends on the problem you're trying to solve, and the discipline of the programmer to follow best practices (use sane defaults, ShellCheck, etc.).
> It goes bad very quickly when, say, you have two CSV files and want to join them the SQL way.
In that case we're talking about structured data, and, yeah, Perl or Python would be easier to work with. That said, depending on the complexity of the CSV, you can still go a long way with plain Bash with IFS/read(1) or tr(1) to split CSV columns. This wouldn't be very robust, but there are tools that handle CSV specifically[1], which can be composed in a shell script just fine.
So it's always a balancing act of being productive quickly with a shell script, or reaching out for a programming language once the tools aren't a good fit, or maintenance becomes an issue.
[1]: https://miller.readthedocs.io/
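To make the CSV-join case above concrete, here is a minimal Python sketch of a SQL-style inner join between two CSV inputs, using only the standard library. The sample data, the `inner_join` and `read_csv` helper names, and the `id` key column are all hypothetical and chosen for illustration; this is not how Miller or any tool mentioned here implements joins.

```python
import csv
import io


def read_csv(text):
    """Parse CSV text into a list of dict rows keyed by the header line."""
    # csv.DictReader handles quoted fields and embedded commas, which is
    # where naive splitting on "," tends to break down.
    return list(csv.DictReader(io.StringIO(text)))


def inner_join(left_rows, right_rows, key):
    """Inner-join two lists of dict rows on a shared column.

    Roughly equivalent to: SELECT * FROM left JOIN right USING (key).
    """
    # Index the right-hand rows by key so each left row is matched
    # with a dictionary lookup instead of a nested scan.
    index = {}
    for row in right_rows:
        index.setdefault(row[key], []).append(row)

    joined = []
    for row in left_rows:
        for match in index.get(row[key], []):
            # Merge the two rows; the shared key column appears once.
            joined.append({**row, **match})
    return joined


# Hypothetical sample data standing in for two CSV files.
USERS_CSV = "id,name\n1,Ada\n2,Grace\n3,Linus\n"
ORDERS_CSV = "id,item\n1,keyboard\n1,mouse\n3,monitor\n"

if __name__ == "__main__":
    for row in inner_join(read_csv(USERS_CSV), read_csv(ORDERS_CSV), "id"):
        print(row)
```

Rows without a match on either side (user `2` in this sample) are dropped, as in a SQL inner join. For large files, a dedicated CSV tool or a database is likely a better fit than holding both inputs in memory.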
-
Need help on cleaning this data!!
where mlr is from https://github.com/johnkerl/miller
-
Running weekly average
If this class of problems (i.e., CSV/TSV data) is your main target, you may find Miller (https://github.com/johnkerl/miller) much more useful in the long run.
-
GQL: A new SQL like query language for .git files written in Rust
That said, you may be interested in Miller (https://github.com/johnkerl/miller) which provides similar capabilities for CSV, TSV, and JSON files. It doesn't use a SQL grammar, but that's just the proverbial lipstick on the thing. I'm not the author, but I have used it and I see some parallels in use cases at the very least.
- johnkerl/miller: Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON
-
Any cli utility to create ascii/org mode tables?
worth giving Miller a shot
-
I wrote this iCalendar (.ics) command-line utility to turn common calendar exports into more broadly compatible CSV files.
CSV utilities (still haven't picked a favorite one...): https://github.com/harelba/q https://github.com/BurntSushi/xsv https://github.com/wireservice/csvkit https://github.com/johnkerl/miller
- Miller: Like Awk, sed, cut, join, and sort for CSV, TSV, and tabular JSON
What are some alternatives?
homebrew-emacs-plus - Emacs Plus formulae for the Homebrew package manager
visidata - A terminal spreadsheet multitool for discovering and arranging data
gecko-dev - Read-only Git mirror of the Mercurial gecko repositories at https://hg.mozilla.org. How to contribute: https://firefox-source-docs.mozilla.org/contributing/contribution_quickref.html
xsv - A fast CSV command line toolkit written in Rust.
vscode-jupyter - VS Code Jupyter extension
jq - Command-line JSON processor [Moved to: https://github.com/jqlang/jq]
golang-samples - Sample apps and code written for Google Cloud in the Go programming language.
dasel - Select, put and delete data from JSON, TOML, YAML, XML and CSV files with a single tool. Supports conversion between formats and can be used as a Go package.
datasette - An open source multi-tool for exploring and publishing data
csvtk - A cross-platform, efficient and practical CSV/TSV toolkit in Golang
oursh - Your comrade through the perilous world of UNIX.
yq - yq is a portable command-line YAML, JSON, XML, CSV, TOML and properties processor