Consider Using CSV

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • tad

    A desktop application for viewing and analyzing tabular data

    Since this is about CSV, here is the obligatory tool for viewing larger files:

    * https://github.com/antonycourtney/tad

  • KeenWrite

    Free, open-source, cross-platform desktop Markdown text editor with live preview, string interpolation, and math.

  • parquet-go

    Pure Go library for reading and writing Parquet files

    > It's so complex to work with, that unless you're specifically in data science, it's both unheard of and unusable.

    FWIW, in my experience at a "data analytics platform" company, it's reasonably popular for data-heavy workflows because Parquet is well-defined and file sizes are a fraction of their CSV equivalents.

    > Is it a limitation of the format itself?

    I don't think so. In other languages, you can generally read/write Parquet files without a ton of dependencies (e.g. https://github.com/xitongsys/parquet-go).

  • js-bson

    BSON Parser for node and browser

  • ndjson.github.io

    Info Website for NDJSON

    No one uses that format for streamed JSON; see ndjson and jsonl.

    http://ndjson.org/

    The size complaint is overblown, as repeated fields are compressed away.
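
    To make the compression point concrete, here is a minimal Python sketch (not from the thread) that writes the same records as NDJSON and as CSV, then compares raw and gzip-compressed sizes. The record fields (`id`, `name`, `active`) and the helper names are made up for illustration, and the exact sizes depend on the data, so treat it as a way to measure your own files rather than a benchmark.

```python
import csv
import gzip
import io
import json


def to_ndjson(records):
    """Serialize records as NDJSON: one JSON object per line."""
    return "".join(json.dumps(r) + "\n" for r in records)


def from_ndjson(text):
    """Parse NDJSON line by line, skipping blank lines."""
    return [json.loads(line) for line in text.split("\n") if line.strip()]


def to_csv(records):
    """Serialize the same records as CSV with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(records[0]))
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


def sizes(text):
    """Return (raw byte count, gzip-compressed byte count)."""
    raw = text.encode("utf-8")
    return len(raw), len(gzip.compress(raw))


# Hypothetical sample data: every record repeats the same three keys.
records = [{"id": i, "name": f"user{i}", "active": i % 2 == 0} for i in range(1000)]

ndjson_raw, ndjson_gz = sizes(to_ndjson(records))
csv_raw, csv_gz = sizes(to_csv(records))

print(f"NDJSON: raw={ndjson_raw} gzip={ndjson_gz}")
print(f"CSV:    raw={csv_raw} gzip={csv_gz}")
```

    The repeated keys inflate the raw NDJSON considerably, but a general-purpose compressor replaces them with short back-references, so the gap to CSV narrows once both are compressed.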

    As other folks rightfully commented, CSV is a minefield. One should assume every CSV file is broken in some way. The post also doesn't enumerate any of the downsides of CSV.
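
    One common way CSV breaks is quoting: a field may legally contain commas, doubled quotes, and even line breaks. The following sketch, with made-up sample data, contrasts a naive split on commas and newlines with Python's standard `csv` module. It illustrates only this one pitfall and is not a catalogue of every way a CSV file can be malformed.

```python
import csv
import io

# Hypothetical sample: the second record has a comma, escaped quotes,
# and an embedded newline inside quoted fields.
data = 'id,name,comment\r\n1,"Doe, Jane","said ""hi""\nthen left"\r\n'

# Naive approach: split on line breaks, then on commas.
naive_rows = [line.split(",") for line in data.splitlines()]

# csv.reader understands quoting; newline="" leaves line endings
# untouched so embedded newlines inside quoted fields survive.
parsed_rows = list(csv.reader(io.StringIO(data, newline="")))

print(len(naive_rows), "rows from naive splitting")
print(len(parsed_rows), "rows from csv.reader")
print(parsed_rows[1])
```

    The naive version splits the quoted record across two lines and breaks the name at its comma, while `csv.reader` returns one header row and one three-field data row.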

    What people should consider is using formats like Avro or Parquet that carry their schema with them, so the data can be loaded and analyzed without having to manually work out what each column means.

  • xsv

    A fast CSV command line toolkit written in Rust.

    For manipulating CSV from the terminal, check out https://github.com/BurntSushi/xsv

  • bsv

    maximum performance data processing (by nathants)

    i had a lot of fun exploring the performance ceiling of csv and csv-like formats. turns out binary encoding of size-prefixed byte arrays is fast[1].

    csv is just a sequence of 2d byte arrays. probably avoid if dealing with heterogeneous external data. possibly use if dealing with homogeneous internal data.

    https://github.com/nathants/bsv
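
    As a rough illustration of what "size-prefixed byte arrays" can look like, here is a small Python sketch. The layout (a 16-bit field count followed by 32-bit length-prefixed fields, little-endian) and the function names are assumptions made for this example; this is not bsv's actual on-disk format.

```python
import struct


def encode_row(fields):
    """Encode one row as: field count (u16), then each field as u32 length + raw bytes."""
    out = [struct.pack("<H", len(fields))]
    for field in fields:
        out.append(struct.pack("<I", len(field)))
        out.append(field)
    return b"".join(out)


def decode_rows(buf):
    """Decode a buffer of concatenated rows back into lists of byte strings."""
    rows = []
    pos = 0
    while pos < len(buf):
        (count,) = struct.unpack_from("<H", buf, pos)
        pos += 2
        row = []
        for _ in range(count):
            (size,) = struct.unpack_from("<I", buf, pos)
            pos += 4
            row.append(buf[pos:pos + size])
            pos += size
        rows.append(row)
    return rows


# Fields may contain commas, newlines, or be empty: no quoting or escaping needed.
sample = [[b"a,b", b"line\nbreak", b""], [b"x"]]
encoded = b"".join(encode_row(r) for r in sample)
print(decode_rows(encoded) == sample)
```

    Because every field announces its length up front, the reader never scans for delimiters or handles escaping, which is one plausible reason such encodings parse quickly.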

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.