DAG orchestration for streaming data?

This page summarizes the projects mentioned and recommended in the original post on /r/dataengineering

  • flow

    🌊 Continuously synchronize the systems where your data lives, to the systems where you _want_ it to live, with Estuary Flow. 🌊 (by estuary)

  • This is essentially how we model things in Flow (disclosure: I work there). We call them Derivations, which are data products that are built (derived) from other data products. Each data product (we call them Collections) is backed by a set of append-only logs, so they can be read by many different consumers at different times. IDK if our product can work for you since we don't (yet) support stuff like MQTT, but there's a pretty generous free tier if you'd be able to push the data over HTTP. Either way, I just think it's cool that others have independently arrived at similar ideas about how to model streaming tasks!
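The model described in this comment (collections backed by append-only logs, with derivations built from other collections) can be sketched in plain Python. This is only an illustration under stated assumptions: `Collection`, `Derivation`, `read_from`, and `poll` are hypothetical names made up for this sketch and are not Estuary Flow's actual API.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable


@dataclass
class Collection:
    """Hypothetical data product backed by an in-memory append-only log."""
    name: str
    _log: list = field(default_factory=list)

    def append(self, doc: dict) -> None:
        self._log.append(doc)  # documents are only ever appended, never edited

    def read_from(self, offset: int) -> list:
        """Return every document at or after `offset`."""
        return self._log[offset:]

    def __len__(self) -> int:
        return len(self._log)


@dataclass
class Derivation:
    """Hypothetical task that derives one collection from another.

    Each derivation keeps its own read offset, so several consumers can
    read the same source collection independently and at different times.
    """
    source: Collection
    target: Collection
    transform: Callable[[dict], Iterable[dict]]
    offset: int = 0

    def poll(self) -> int:
        """Process source documents not seen yet; return how many were read."""
        new_docs = self.source.read_from(self.offset)
        for doc in new_docs:
            for out in self.transform(doc):
                self.target.append(out)
        self.offset += len(new_docs)
        return len(new_docs)


readings = Collection("readings")
fahrenheit = Collection("fahrenheit")
alerts = Collection("alerts")

# Two derivations read the same source collection.
to_f = Derivation(
    readings, fahrenheit,
    lambda d: [{"sensor": d["sensor"], "temp_f": d["temp_c"] * 9 / 5 + 32}],
)
hot = Derivation(readings, alerts, lambda d: [d] if d["temp_c"] > 30 else [])

readings.append({"sensor": "a", "temp_c": 20})
readings.append({"sensor": "b", "temp_c": 35})
first_batch = to_f.poll()   # to_f catches up on the first two documents
readings.append({"sensor": "c", "temp_c": 40})
late_batch = hot.poll()     # hot starts later and reads all three at once
second_batch = to_f.poll()  # to_f reads only the one document it has not seen
```

Because the log is never rewritten, a consumer that starts late (`hot` here) sees the same history as one that has been polling all along.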

  • quix-streams

    A Python library for building containerized ML and Generative AI applications with Apache Kafka.

  • Full disclosure: I work at Quix. I don't always make this recommendation, but we could be a really good fit in this case. We support any number of transformation steps and destinations (outputs/sinks), so you can chain as many as you'd like, and the pipeline stays fully transparent because it's all in code (a DAG visualiser is included). We have an open source client library and code samples, and we work really well with telemetry/time-series/IoT/sensor data (the founders worked in the McLaren Racing F1 team). We also have a Kinesis connector and recently collaborated with AWS on a solution for Brompton Bicycle (using Kinesis Firehose). We have a free tier, and I'm happy to help if you reach its limits or have any questions.
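The idea of chaining transformation steps and fanning out to several sinks, all defined in code, can be sketched with the Python standard library alone. This is a generic illustration, not the Quix Streams API: the step names, the `steps` and `sinks` tables, and the `run` helper are all assumptions invented for this example, and a real deployment would read from and write to a broker rather than in-memory lists.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def parse(records):
    """Turn 'sensor,value' strings into dictionaries."""
    return [{"sensor": s, "value": float(v)} for s, v in (r.split(",") for r in records)]


def drop_outliers(records):
    """Keep only readings inside an assumed valid range of 0 to 100."""
    return [r for r in records if 0.0 <= r["value"] <= 100.0]


def to_alerts(records):
    """Keep only readings above an assumed alert threshold of 80."""
    return [r for r in records if r["value"] > 80.0]


# Hypothetical pipeline definition: step name -> (function, upstream step).
# An upstream of None means the step reads the raw input.
steps = {
    "parse": (parse, None),
    "drop_outliers": (drop_outliers, "parse"),
    "alerts": (to_alerts, "drop_outliers"),
}

# Hypothetical sinks: sink name -> the step whose output it receives.
sinks = {"archive": "drop_outliers", "pager": "alerts"}


def run(raw):
    """Run every step in dependency order and return what each sink receives."""
    graph = {name: ({up} if up else set()) for name, (_, up) in steps.items()}
    results = {}
    for name in TopologicalSorter(graph).static_order():
        fn, up = steps[name]
        results[name] = fn(raw if up is None else results[up])
    return {sink: results[step] for sink, step in sinks.items()}


out = run(["a,42", "b,95.5", "c,-7"])
```

Keeping the graph as plain data is what makes this kind of pipeline easy to inspect or draw: the same `steps` and `sinks` tables that drive execution could also feed a visualiser.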

  • quix-samples

    Library samples repository of Quix. Explore and Deploy them easily on https://portal.platform.quix.ai
