Python Dataflow

Open-source Python projects categorized as Dataflow

Top 14 Python Dataflow Projects

  1. pathway

    Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.

    Project mention: GitHub's Fake Star Economy | news.ycombinator.com | 2026-04-20
  3. marimo

    A reactive notebook for Python — run reproducible experiments, query with SQL, execute as a script, deploy as an app, and version with git. Stored as pure Python. All in a modern, AI-native editor.

    Project mention: Pluto.jl 1.0 release – reactive notebook for Julia | news.ycombinator.com | 2026-06-03

    Pluto is great. I use it all the time. If you like the reactivity/reproducibility but are wedded to Python, you might want to check out Marimo, which is also great. [https://marimo.io/]

    It too puts the output of a cell above the code so if you're unable to adapt to things that are different it's also probably not for you.

    FWIW, Observable's Notebooks (Javascript) work the same way: output above the code that produces it. [https://observablehq.com/]

    I too did not like having the output above the code but got over it pretty quickly. For plots, it's arguably better: usually, I want to see the plot before I see the 15-line invocation of some plot command. The thing that bugs me the most about Pluto now is that it really wants you to only have a single evaluating statement per cell. You have to wrap stuff in "begin ... end" if you want to e.g. define more than one variable in a cell.

  4. pyt

    A Static Analysis Tool for Detecting Security Vulnerabilities in Python Web Applications

  5. bytewax

    Python Stream Processing

    Project mention: Bytewax: Stream processing library built using Python and Rust | news.ycombinator.com | 2026-05-22
  6. ipyflow

    A reactive Python kernel for Jupyter notebooks.

    Project mention: Representing Python notebooks as dataflow graphs | news.ycombinator.com | 2025-08-09

    Even with a dataflow extension (such as ipyflow [0]), I am still struggling with the execution model of notebooks in general. I often still see people defining functions and classes in notebooks to somehow handle prototyping loops.

    I would love to see DAGs like in the SSA form of compilers that also support loop operators. However, IMHO the notebook interface also needs to adjust for that (cell indentation?). However, the strength of notebooks rather shows in document authoring like quarto, which IMHO mostly contradicts more complex control flow.

    [0] https://github.com/ipyflow/ipyflow

  7. pytm

    A Pythonic framework for threat modeling

  8. finn

    Dataflow compiler for QNN inference on FPGAs

  9. NIPY

    Workflows and interfaces for neuroimaging packages

  10. prefect-deployment-patterns

    Code examples showing flow deployment to various types of infrastructure

  11. entangle

    A lightweight (serverless) native python parallel processing framework based on simple decorators and call graphs.

  12. flowsaber

    Dataflow based workflow framework

  13. krnel-graph

    Lightweight representation engineering dataflow operations for agent developers.

    Project mention: Show HN: Krnel-Graph, a Library For | news.ycombinator.com | 2025-11-03
  14. pglineage

    pglineage is a tool to create data flow diagrams for PostgreSQL by analyzing SQL

  15. m42pl-core

    A data manipulation language with a focus on flexibility and simplicity.

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).


Python Dataflow related posts

  • Show HN: Marimo pair – reactive Python notebooks as environments for agents

    11 projects | news.ycombinator.com | 7 Apr 2026
  • Representing Python notebooks as dataflow graphs

    3 projects | news.ycombinator.com | 9 Aug 2025
  • It's 2025: Your Python Toolbox Is More Than Just PyCharm

    3 projects | dev.to | 31 Jul 2025
  • pathway VS cocoindex - a user suggested alternative

    2 projects | 1 Apr 2025
  • Can anyone tell if Xilinx's FINN (from Xilinx's research lab) is restricted for use only to xilinx based FPGAs?

    2 projects | /r/FPGA | 8 Apr 2023
  • flowsaber, a dataflow-based workflow package written in python

    1 project | /r/Python | 7 May 2021
  • flowsaber, a dataflow-based workflow package written in python. It's extensible, and has a highly intuitive composing syntax, with native shell task support. The whole flows is linked and composed from channels and tasks, different runs of a task with different inputs will be scheduled and ran par

    1 project | /r/programming | 7 May 2021

Index

What are some of the best open-source Dataflow projects in Python? This list will help you:

#   Project                      Stars
1   pathway                      63,006
2   marimo                       21,378
3   pyt                          2,206
4   bytewax                      1,964
5   ipyflow                      1,268
6   pytm                         1,127
7   finn                         1,007
8   NIPY                         826
9   prefect-deployment-patterns  110
10  entangle                     104
11  flowsaber                    41
12  krnel-graph                  22
13  pglineage                    17
14  m42pl-core                   4

