Stream Processing

Open-source projects categorized as Stream Processing

Top 23 Stream Processing Open-Source Projects

  • mediapipe

    Cross-platform, customizable ML solutions for live and streaming media.

  • Project mention: Mediapipe openpose Controlnet model for SD | /r/localdiffusion | 2023-11-15

    mediapipe/docs/solutions/pose.md at master · google/mediapipe · GitHub

  • vector

    A high-performance observability data pipeline.

  • Project mention: Docker Log Observability: Analyzing Container Logs in HashiCorp Nomad with Vector, Loki, and Grafana | dev.to | 2024-04-19

    job "vector" {
      datacenters = ["dc1"]
      # system job, runs on all nodes
      type = "system"
      group "vector" {
        count = 1
        network {
          port "api" {
            to = 8686
          }
        }
        ephemeral_disk {
          size   = 500
          sticky = true
        }
        task "vector" {
          driver = "docker"
          config {
            image   = "timberio/vector:0.30.0-debian"
            ports   = ["api"]
            volumes = ["/var/run/docker.sock:/var/run/docker.sock"]
          }
          env {
            VECTOR_CONFIG          = "local/vector.toml"
            VECTOR_REQUIRE_HEALTHY = "false"
          }
          resources {
            cpu    = 100 # 100 MHz
            memory = 100 # 100MB
          }
          # template with Vector's configuration
          template {
            destination   = "local/vector.toml"
            change_mode   = "signal"
            change_signal = "SIGHUP"
            # overriding the delimiters to [[ ]] to avoid conflicts with Vector's native templating, which also uses {{ }}
            left_delimiter  = "[["
            right_delimiter = "]]"
            data=<

  • awesome-bigdata

    A curated list of awesome big data frameworks, resources and other awesomeness.

  • Project mention: Good coding groups for black women? | news.ycombinator.com | 2024-01-13
  • redpanda

    Redpanda is a streaming data platform for developers. Kafka API compatible. 10x faster. No ZooKeeper. No JVM!

  • Project mention: Choosing Between a Streaming Database and a Stream Processing Framework in Python | dev.to | 2024-02-10

    Stream processing platforms such as Apache Kafka, Apache Pulsar, or Redpanda are specifically engineered to foster event-driven communication in a distributed system, and they can be a great choice for developing loosely coupled applications. They analyze data in motion, which keeps latency close to zero. For example, consider an alert system for monitoring factory equipment. If a machine's temperature exceeds a certain threshold, a streaming platform can instantly trigger an alert so that engineers can perform timely maintenance.
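    The factory alert described in this excerpt can be sketched in a few lines of plain Python. This is only an illustration of per-event threshold checking, not the API of Kafka, Pulsar, or Redpanda: the function name `temperature_alerts`, the event fields `machine` and `temperature_c`, and the 90-degree limit are all assumptions made up for the example.

```python
from typing import Any, Dict, Iterable, Iterator

# Hypothetical threshold, chosen only for illustration.
TEMPERATURE_LIMIT_C = 90.0


def temperature_alerts(
    readings: Iterable[Dict[str, Any]],
    limit: float = TEMPERATURE_LIMIT_C,
) -> Iterator[str]:
    """Yield an alert message as soon as a reading exceeds the limit."""
    for reading in readings:
        # Each event is checked as it arrives; nothing is batched.
        if reading["temperature_c"] > limit:
            yield f"ALERT: {reading['machine']} at {reading['temperature_c']} C"


if __name__ == "__main__":
    # Made-up sensor events standing in for a real message stream.
    events = [
        {"machine": "press-1", "temperature_c": 72.5},
        {"machine": "press-2", "temperature_c": 95.0},
    ]
    for alert in temperature_alerts(events):
        print(alert)
```

    Because the function is a generator, it would work the same way over an unbounded source, such as a consumer loop reading from a broker, as it does over the short list used here.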

  • awesome-system-design

    A curated list of awesome System Design (A.K.A. Distributed Systems) resources.

  • Project mention: Ask HN: Resources to learn boring architecture for a small startup? | news.ycombinator.com | 2023-12-25

    https://github.com/madd86/awesome-system-design

  • Benthos

    Fancy stream processing made operationally mundane

  • Project mention: Ask HN: Who is hiring? (December 2023) | news.ycombinator.com | 2023-12-01
  • watermill

    Building event-driven applications the easy way in Go.

  • Project mention: Microservices communication | /r/golang | 2023-12-09

    I’ve successfully worked on projects using an asynchronous event-driven way of connecting services. I really like the decoupling of business logic and the events triggering it. I highly recommend https://github.com/ThreeDotsLabs/watermill to be more flexible when it comes to choosing the actual technology driving the async pattern. It might be NATS today, but requirements might change and you may need to switch. Watermill prepares you for this.

  • Faust

    Python Stream Processing

  • Project mention: Faust VS quix-streams - a user suggested alternative | libhunt.com/r/faust | 2023-12-07
  • risingwave

    Cloud-native SQL stream processing, analytics, and management. KsqlDB and Apache Flink alternative. 🚀 10x more productive. 🚀 10x more cost-efficient.

  • Project mention: Proton, a fast and lightweight alternative to Apache Flink | news.ycombinator.com | 2024-01-30

    How does this compare to RisingWave and Materialize?

    https://github.com/risingwavelabs/risingwave

  • Hazelcast

    Hazelcast is a unified real-time data platform combining stream processing with a fast data store, allowing customers to act instantly on data-in-motion for real-time insights.

  • Project mention: Does anyone know any good java implementations for distributed key-value store? | /r/ExperiencedDevs | 2023-06-08

    You're probably looking for Hazelcast here. Note that it does much more than just a distributed k/v, but it will get you where you need to go.

  • ksql

    The database purpose-built for stream processing applications.

  • materialize

    The data warehouse for operational workloads. (by MaterializeInc)

  • Project mention: Ask HN: How Can I Make My Front End React to Database Changes in Real-Time? | news.ycombinator.com | 2024-04-17

    [2] https://materialize.com/

  • fluent-bit

    Fast and Lightweight Logs and Metrics processor for Linux, BSD, OSX and Windows

  • Project mention: Observability at KubeCon + CloudNativeCon Europe 2024 in Paris | dev.to | 2024-03-26

    Fluentbit

  • hudi

    Upserts, Deletes And Incremental Processing on Big Data.

  • Project mention: Getting Started with Flink SQL, Apache Iceberg and DynamoDB Catalog | dev.to | 2023-12-18

    Apache Iceberg is one of three lakehouse table formats; the other two are Apache Hudi and Delta Lake.

  • river

    🌊 Online machine learning in Python

  • Project mention: 🔍Underrated Open Source Projects You Should Know About 🧠 | dev.to | 2024-03-20

    River is a Python library for online machine learning. Online models learn from one observation at a time, so they can adapt dynamically to new patterns in the data. This suits data that is generated as a function of time, e.g., stock price prediction and content personalization.
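    The idea of learning from one observation at a time can be sketched without any library. The following is a minimal illustration in plain Python and is not River's API: the class `OnlineLinearModel`, its method names, the learning rate, and the synthetic data are all hypothetical choices made for this example.

```python
class OnlineLinearModel:
    """One-feature linear model updated by stochastic gradient descent."""

    def __init__(self, learning_rate: float = 0.1) -> None:
        self.weight = 0.0
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, x: float) -> float:
        """Return the current estimate for input x."""
        return self.weight * x + self.bias

    def update(self, x: float, y: float) -> None:
        """Take one squared-error gradient step using a single observation."""
        error = self.predict(x) - y
        self.weight -= self.learning_rate * error * x
        self.bias -= self.learning_rate * error


if __name__ == "__main__":
    model = OnlineLinearModel()
    # Synthetic stream following y = 2x + 1 (made-up data for illustration).
    for _ in range(500):
        for step in range(11):
            x = step / 10
            model.update(x, 2 * x + 1)
    print(round(model.weight, 3), round(model.bias, 3))
```

    The model never stores past observations. If the relationship between `x` and `y` drifted later in the stream, further calls to `update` would move the parameters toward the new pattern.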

  • danfojs

    Danfo.js is an open source, JavaScript library providing high performance, intuitive, and easy to use data structures for manipulating and processing structured data.

  • arroyo

    Distributed stream processing engine in Rust

  • Project mention: FLaNK AI Weekly 18 March 2024 | dev.to | 2024-03-18
  • dpark

    Python clone of Spark, a MapReduce-like framework in Python

  • fluvio

    Lean and mean distributed stream processing system written in Rust and WebAssembly.

  • Project mention: Ask HN: WebSocket Relay? | news.ycombinator.com | 2024-02-27
  • PipelineDB

    High-performance time-series aggregation for PostgreSQL

  • Project mention: PostgreSQL Is Enough | news.ycombinator.com | 2024-02-06
  • awesome-streaming

    A curated list of awesome streaming frameworks, applications, etc.

  • Memgraph

    Open-source graph database, tuned for dynamic analytics environments. Easy to adopt, scale and own.

  • Project mention: Ask HN: Who is hiring? (March 2024) | news.ycombinator.com | 2024-03-01

    Memgraph | Staff C++ Database Engineer | REMOTE (Central/Western Europe, LatAm, or North America) https://memgraph.com/

    Memgraph is a Seed stage, open source graph database vendor. Graph DBs are a great solution for GenAI, logistics, cybersecurity and fintech so we are looking to grow aggressively this year.

    We're looking for a staff-level engineer to set technical direction, mentor junior team members, and solve some very difficult problems.

    Either DM me (the hiring manager) or apply here: https://join.com/companies/memgraph/10684850-staff-software-...

  • go-streams

    A lightweight stream processing library for Go

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source Stream Processing projects? This list will help you:

#    Project                  Stars
1    mediapipe                25,405
2    vector                   16,427
3    awesome-bigdata          12,792
4    redpanda                  8,784
5    awesome-system-design     8,297
6    Benthos                   7,559
7    watermill                 6,729
8    Faust                     6,674
9    risingwave                6,283
10   Hazelcast                 5,861
11   ksql                      5,811
12   materialize               5,567
13   fluent-bit                5,321
14   hudi                      5,066
15   river                     4,766
16   danfojs                   4,649
17   arroyo                    3,275
18   dpark                     2,691
19   fluvio                    2,638
20   PipelineDB                2,603
21   awesome-streaming         2,557
22   Memgraph                  2,086
23   go-streams                1,753
