Command-line Tools can be 235x Faster than your Hadoop Cluster (2014)

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • manta

    Manta is a scalable HTTP-based object store (by TritonDataCenter)

  • These posts always remind me of the [Manta Object Storage](https://www.tritondatacenter.com/triton/object-storage) project by Joyent. This project was basically a combination of object storage with the added ability to run arbitrary programs against your data in situ. The key difference was that you kept the data in place and distributed the program to the data storage nodes (the opposite of most data processing, as I understand it). I think of this as a superpowered version of using [pssh](https://linux.die.net/man/1/pssh) to grep logs across a datacenter. Yet another idea before its time. Luckily, Joyent [open sourced](https://github.com/TritonDataCenter/manta) the work, but the fact that it still hasn't caught on as "The Way" is telling.

    Some of the projects I remember from the Joyent team were: dumping recordings of local Mario Kart games to Manta and running analytics on the raw video to generate office kart racer stats; the bog-standard "dump all the logs and map/reduce/grep/count them"; and, I think, one about running mdb postmortems on terabytes of core dumps.
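
The "send the program to the data" idea in the comment above can be sketched in a few lines of Python. This is only an illustration of the pattern, not Manta's API: the `NODES` dictionary, the log lines, and the function names `map_on_node` and `grep_count` are all hypothetical, and a thread pool stands in for real storage nodes. Each "node" runs the grep-and-count step over its own lines, and only the small per-node counts travel back to be summed.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for storage nodes: each node holds its own log lines.
NODES = {
    "node-a": ["GET /index 200", "GET /missing 404", "POST /login 500"],
    "node-b": ["GET /index 200", "GET /index 200"],
    "node-c": ["POST /login 500", "GET /missing 404", "GET /health 200"],
}


def map_on_node(lines, needle):
    """Map phase: runs next to the data and returns only a small summary.

    Keeps lines containing `needle` and counts them by their last field
    (the status code in this made-up log format).
    """
    return Counter(line.split()[-1] for line in lines if needle in line)


def grep_count(nodes, needle):
    """Run the same function against every node, then reduce the partial counts."""
    with ThreadPoolExecutor() as pool:
        partials = pool.map(lambda lines: map_on_node(lines, needle), nodes.values())
        total = Counter()
        for partial in partials:
            total.update(partial)  # reduce phase: sum the per-node counts
    return total


if __name__ == "__main__":
    print(dict(grep_count(NODES, "GET")))
```

In a real system the map step would execute on the machine that stores the object, so the raw logs never cross the network; here that locality is only simulated.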

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Related posts

  • Do you donate your CPU time, storage, or bandwidth from your homelab to any altruistic purpose?

    1 project | /r/homelab | 16 Oct 2023
  • About new Session encryption protocol...

    1 project | /r/Session_Messenger | 6 Aug 2023
  • NFT Payload Storage Options

    2 projects | dev.to | 24 Jul 2023
  • Cloud Storage for Back Up

    1 project | /r/DataHoarder | 20 May 2023
  • Another Dead 5n2/soldier

    1 project | /r/drobo | 30 Mar 2023