TV is a cross-platform CSV pretty printer made to maximize viewer enjoyment

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • tidy-viewer

    📺(tv) Tidy Viewer is a cross-platform CLI csv pretty printer that uses column styling to maximize viewer enjoyment.

  • First of all - kudos on tackling this task - it is indeed very annoying to get CSVs to render nicely on a terminal.

    > How does tidy-viewer compare with csvlook?

    The most important issue to me is that csvlook is a much less pleasant viewing experience, but there is also this: csvlook reads and parses all of the data. Try pushing diamonds.csv to csvlook. When I do it on my machine it takes 15.228 seconds while tv takes 0.0042 seconds. For this reason tv is much faster, but speed is not the goal of the package. tv's purpose is to maximize viewer enjoyment.

    2. Looking at the demo video, there seems to be an odd fixation with "N/A". The CSV spec, AFAIK, doesn't recognize this phrase. I don't understand why someone would expect a quoted string field whose raw characters are "n/a" to be rendered as anything other than n/a (i.e. lowercase and without the quotes). I'm guessing maybe in your workflow you want to use that phrase a lot, but for a tool for the general public I'd not do this kind of interpretation; and I would leave an empty field as empty.

    I could not say it better than this:

    > The norm of treating missing data as NA exists in R (which the developer of this is clearly inspired by, based on the GitHub readme). Pandas in Python is stuck with NaN for numeric types (not quite correct) and "" or None for string types. Personally I like the choice to both explicitly render missing data in colour and to apply NA as a placeholder text to display that colour.
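
The display rule being debated here can be sketched in a few lines of Python. This is only an illustration of the idea, not tv's actual code: the function name `display_cell` and the set `MISSING_TOKENS` are hypothetical, and which strings count as "missing" is exactly the judgment call the commenter objects to.

```python
# Hypothetical sketch: map "missing-looking" cells to one placeholder.
# MISSING_TOKENS is an assumed list, not taken from tidy-viewer.
MISSING_TOKENS = {"", "na", "n/a", "nan", "null", "none"}


def display_cell(raw, placeholder="NA"):
    """Return the placeholder for missing values, else the raw text unchanged."""
    if raw is None or raw.strip().lower() in MISSING_TOKENS:
        return placeholder
    return raw


row = ["42", "", "n/a", "hello"]
print([display_cell(cell) for cell in row])
```

A stricter variant would keep only the empty string in the set, which matches the commenter's preference of leaving quoted text such as "n/a" untouched.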

    3. tidy-viewer seems to require "unstable library features", or at least ones which were unstable as of Rust 1.48.0. It would be nice if you could be compatible with older Rust distributions/versions.

    That is a good point. I also release binaries, which I think makes this less of a requirement. What are your thoughts?

    4. Many systems, especially older ones, especially ones which you access remotely and don't have root privileges on, won't have a rust installation. It would be even more convenient if you could provide binaries with little or no extra dynamic library dependencies, which could be used on older / rustless systems. I realize this is a tall order, however.

    With GitHub Actions I auto-build binaries for many OSes. See https://github.com/alexhallam/tv/releases/tag/0.0.13

    5. What about scrolling? The worst part of viewing CSVs is having to handle wide ones which exceed the terminal width, and having decent horizontal as well as vertical scrolling ability. less doesn't cut it, because it doesn't keep the header row, plus it doesn't recognize field widths.

    Scrolling is nice. To offer scrolling, the only option I am aware of is turning this CLI into a TUI. I made the choice early on to take the more minimal path and stick to a CLI. The goal is to be a `column` replacement, not a spreadsheet replacement.

    6. tidy-viewer does not seem to support wrapping longer fields onto multiple terminal lines.

    The goal is to glance at the data as a whole, not at individual cells or fields. If there are cells with long text they get cut at 20 characters. I like this a lot. I would prefer to know that there is a lot of text that I can dig into later, but when I am glancing at the csv I just want an overall picture. In my view tables of data are data visualizations, meaning that I don't have to show everything to understand enough of it.
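
As a rough illustration of cutting long cells at a fixed width, here is a minimal Python sketch. The 20-character limit comes from the reply above; the function name `truncate_cell` and the trailing "…" marker are assumptions made for the example and may not match what tv prints.

```python
def truncate_cell(text: str, width: int = 20, marker: str = "…") -> str:
    """Return text unchanged if it fits, else cut it so the result is `width` characters."""
    if len(text) <= width:
        return text
    # Reserve room for the marker so the cut cell is exactly `width` characters.
    return text[: width - len(marker)] + marker


row = ["id-1", "a short note", "a much longer free-text comment that would dominate the table"]
print([truncate_cell(cell) for cell in row])
```

The marker signals that more text exists, which fits the stated goal of knowing there is something to dig into later.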

    7. When the user doesn't specify the color scheme, are you choosing one based on the terminal colors, or are you using absolute color values? I suggest the former.

    Great question. I want to eventually add the ability for users to make a config file with their own colors. At this time I just have absolute presets. If you are interested I would happily take a contribution that allows users the option to configure tv with some dotfile.

    8. tidy-viewer loads and parses the entire CSV immediately; and, in fact, seems to keep two copies of it in memory at once. This means it cannot be used with large files without thrashing; and even if your CSV does fit in global memory, it will still be kind of unusable, trying to dump gigabytes onto the terminal.

    That is almost true. tidy-viewer reads the entire csv, but only parses the head. If I knew of a way to get the number of rows and columns of a csv without reading the whole file then I would. I know there is a good deal more room for memory optimization. This is not my strength and I am still learning.
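
The "read everything, parse only the head" approach might look like the following Python sketch. It is not how tidy-viewer is implemented (tv is written in Rust); the function name `head_and_shape` and the `n_head` parameter are hypothetical. It also assumes no quoted field contains an embedded newline, since the tail is counted by lines rather than by parsed records.

```python
import csv
import io
from itertools import islice


def head_and_shape(stream, n_head=25):
    """Parse the header and the first n_head records; only count the remaining lines."""
    reader = csv.reader(stream)
    header = next(reader, [])
    head = list(islice(reader, n_head))
    # The rest of the stream is still read, but never split into fields.
    remaining = sum(1 for line in stream if line.strip())
    return header, head, (len(head) + remaining, len(header))


data = "a,b,c\n" + "".join(f"{i},{i},{i}\n" for i in range(100))
header, head, shape = head_and_shape(io.StringIO(data), n_head=5)
print(header, len(head), shape)
```

Only the head rows are held in memory; the tail is consumed one line at a time, so the row count costs a full read but not a full parse or a full copy.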

    9. Bottom line: A nice initial effort, but the more serious challenges are yet to be tackled, plus needs to be more robustly cross-platform.

    Thanks for the compliment. It is still a work in progress.

  • tac

    A high-performance, cross-platform file reverse utility (by neosmart)

  • cat just regurgitates the contents of the file, but the resulting piped fd is non-seekable. Since almost every command that can operate on a file from stdin can also operate on the file by name/path, at best this is just a needless invocation of a process (i.e. `tv foo.csv` should have been used instead of `cat foo.csv | tv`) - if the app in question can't handle paths, then you can have the shell pipe the file into it instead (e.g. `tv < foo.csv`). At worst, the recipient program would need to buffer the entire contents of the input if it needs to perform non-sequential operations on the source data - this is the case with things like `tac` that need to seek to the end of the input (see https://github.com/neosmart/tac for how `cat foo | tac` requires buffering but both `tac foo` and even `tac < foo` don't).
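
The difference between a piped and a redirected input can be checked directly. The sketch below is an illustration in Python, with a hypothetical helper `can_seek`; it compares a regular file with the read end of a pipe, which is what a program sees on stdin after `cat foo.csv | prog`.

```python
import os
import tempfile


def can_seek(fileobj) -> bool:
    """Report whether the input supports random access (regular file) or not (pipe)."""
    return fileobj.seekable()


# A regular file, like stdin after `prog < foo.csv`, supports seeking.
with tempfile.TemporaryFile() as regular:
    regular.write(b"a,b\n1,2\n")
    file_result = can_seek(regular)

# The read end of a pipe, like stdin after `cat foo.csv | prog`, does not.
read_fd, write_fd = os.pipe()
with os.fdopen(read_fd, "rb") as pipe_in, os.fdopen(write_fd, "wb"):
    pipe_result = can_seek(pipe_in)

print(file_result, pipe_result)
```

A tool could run the same check on `sys.stdin.buffer` at startup and fall back to buffering the whole input only when seeking is unavailable.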

  • csv123

    CSV 1-2-3 - A CLI viewer for .csv files

  • I've made a Lotus 1-2-3 inspired CSV viewer for the terminal too. Had big plans for it, but it's just a basic viewer now:

    https://github.com/evert/csv123

  • xsv

    A fast CSV command line toolkit written in Rust.

  • XSV [0] can also pretty-print (minus the colors), but that's just the tip of the iceberg as far as what it can do. It's very handy for quick statistical analysis of CSV input.

    [0]: https://github.com/BurntSushi/xsv

  • ngrid

    It's "less" for data!

  • A while ago, Two Sigma Investments open-sourced its own curses-based internal tool for pretty printing tabular data: https://github.com/twosigma/ngrid

  • murex

    A smarter shell and scripting environment with advanced features designed for usability, safety and productivity (eg smarter DevOps tooling)

  • Shameless plug, but so does my shell, https://github.com/lmorg/murex

      $ open test/example.csv | format generic
