Building an open data pipeline in 2024

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com.

  • xrootd

    The XRootD central repository https://my.cdash.org/index.php?project=XRootD

  • I take issue with this part of the article:

    > In general, managed tools will give you stronger governance and access controls compared to open source solutions. For businesses dealing with sensitive data that requires a robust security model, commercial solutions may be worth investing in, as they can provide an added layer of reassurance and a stronger audit trail.

    There are definitely open source solutions capable of managing vast amounts of data securely. The storage group at CERN develops EOS (a distributed filesystem based on the XRootD framework), and CERNBox, which puts a nice web interface on top. See https://github.com/xrootd/xrootd and https://github.com/cern-eos/eos for more information.

  • eos

    EOS Storage (by cern-eos)


