flow VS open-data

Compare flow vs open-data and see what their differences are.

flow

🌊 Continuously synchronize the systems where your data lives, to the systems where you _want_ it to live, with Estuary Flow. 🌊 (by estuary)
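Mentions: flow 10, open-data 25
Stars: flow 506, open-data 2,221
Growth: flow 5.3%, open-data 1.2%
Activity: flow 9.7, open-data 0.0
Last commit: flow 5 days ago, open-data 19 days ago
Language: C++
License: GNU General Public License v3.0 or later (both projects)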
The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

flow

Posts with mentions or reviews of flow. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-06-22.
  • Unexpected downsides of UUID keys in PostgreSQL
    6 projects | news.ycombinator.com | 22 Jun 2023
    We use a macaddr8 that embeds a wall-clock timestamp (so they're in ascending order, achieving data locality) with some additional randomness. It's worked really well for us:

    https://github.com/estuary/flow/blob/master/supabase/migrati...

    We use macaddr8 instead of bigint because it has a Postgres serialization / JSON encoding that losslessly round-trips with browsers, and it works well with PostgREST. The same CANNOT be said for bigint, which is a huge footgun.

  • Need Advice on Real-Time Data Synchronization from PostgreSQL to BigQuery: Airbyte vs. CloudQuery?
    1 project | /r/dataengineering | 16 May 2023
    I can't claim to know much about CloudQuery, but we are an open-source platform with CDC connectors from PostgreSQL and materializations to BQ and elsewhere. We also have fully-managed connectors if you don't want to deal with hosting.
  • DAG orchestration for streaming data?
    3 projects | /r/dataengineering | 10 May 2023
    This is essentially how we model things in Flow (disclosure: I work there). We call them Derivations, which are data products that are built (derived) from other data products. Each data product (we call them Collections) is backed by a set of append-only logs, so they can be read by many different consumers at different times. IDK if our product can work for you since we don't (yet) support stuff like MQTT, but there's a pretty generous free tier if you'd be able to push the data over HTTP. Either way, I just think it's cool that others have independently arrived at similar ideas about how to model streaming tasks!
  • quickly replace a small airbyte instance in my stack
    1 project | /r/dataengineering | 19 Apr 2023
  • Advise on incremental process of Kafka data on Snowflake
    1 project | /r/dataengineering | 15 Apr 2023
    We (Estuary; Git, Docs) have an open-source connector for Kafka -> Snowflake that could perform the tasks of a) flattening the data and b) removing duplicates via exactly-once end-to-end delivery.
  • Ask HN: Who is hiring? (September 2022)
    20 projects | news.ycombinator.com | 1 Sep 2022
    Estuary Technology | Backend Engineer | Developer Evangelist | Rust, Go | REMOTE OR HYBRID | UTC-7 to UTC+2

    Regional offices in NYC & Columbus, OH

    Estuary (https://www.estuary.dev/) is the first real-time Data Operations platform for future-proof pipelines, including both historical and real-time data set up in minutes.

    Our team is rapidly growing, VC funded and led by two successful, repeat founders.

    We primarily develop in Rust and Go and are heavily built on top of gazette which is an internally developed streaming engine.

    Flow: https://github.com/estuary/flow

    Gazette: https://gazette.readthedocs.io/en/latest/

    Backend Engineer: https://www.estuary.dev/about/#backend

    Developer Evangelist: https://www.estuary.dev/about/#developerevangelist

    ^This is an exciting opportunity to make direct impact and shape user perception of a new product that brings a fresh experience to working with real-time data.

    As this is a unique role, we are open to a variety of personas (data engineers, backend developers, Solutions Engineers and of course DevRel professionals).

    Estuary offers full health benefits, competitive salary, unlimited PTO, 401K, equity, and a culture that values trust, transparency, and a flexible work environment to optimize your work/life balance.

    To apply, send your resume and any questions to [email protected]

  • Who's Hiring? - August 2022
    1 project | /r/golang | 4 Aug 2022
    Flow, Gazette. We are looking for a backend engineer who is early in their career (around 1-3 years of industry experience) to join our team.
  • Ask HN: Who is hiring? (July 2022)
    13 projects | news.ycombinator.com | 1 Jul 2022
    Estuary Technology | Junior Backend Engineer | Rust, Go | REMOTE OR HYBRID | Regional offices in NYC & Columbus, OH

    Estuary (https://www.estuary.dev/) is the first real-time Data Operations platform for future-proof pipelines, including both historical and real-time data set up in minutes.

    Our team is rapidly growing, VC funded and led by two successful, repeat founders.

    We primarily develop in Rust and Go and are heavily built on top of gazette which is an internally developed streaming engine.

    Flow: https://github.com/estuary/flow

    Gazette: https://gazette.readthedocs.io/en/latest/

    We are looking for a junior backend engineer with 2-3 years of industry experience.

    For engineers who have an unquenched curiosity and drive to solve complex distributed systems problems, this is an opportunity to advance your career alongside a team of subject matter experts.

    We are focused on expanding our catalog of open-source data connectors and building out our managed service platform.

    ESTIMATED COMPENSATION: $110,000 - $150,000.

    Estuary offers full health benefits, competitive salary, unlimited PTO, 401K, equity, and a culture that values trust, transparency, and a flexible work environment to optimize your work/life balance.

    Email your resume to [email protected] to apply!

  • On 2022-04-05, the default branch will be renamed from “master” to “main”
    4 projects | news.ycombinator.com | 22 Mar 2022
    It does seem like a weird bug that this would cause errors: https://github.com/estuary/flow/runs/5642694619?check_suite_... Seems like it should be some kind of warning instead of an error?
  • Ask HN: Is there a way to subscribe to an SQL query for changes?
    17 projects | news.ycombinator.com | 22 Apr 2021
    where you'd subscribe for live updates.

    [1]: https://github.com/estuary/flow
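
The time-ordered key idea from the UUID post above (a wall-clock timestamp plus some randomness, rendered in macaddr8 form) can be sketched roughly as follows. This is an illustration only, not Estuary's actual implementation: the 48-bit millisecond / 16-bit random split and the names `new_id` and `id_to_int` are assumptions made for this example. The last lines show the bigint footgun the post alludes to: a 64-bit integer parsed as a double-precision JSON number can silently lose precision.

```python
import os
import time


def new_id(now_ms=None):
    """Return a 64-bit, time-ordered ID as a macaddr8-style string.

    Assumed layout (hypothetical, for illustration): the high 48 bits hold
    wall-clock milliseconds, the low 16 bits hold random data.
    """
    if now_ms is None:
        now_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(2), "big")
    value = ((now_ms & 0xFFFFFFFFFFFF) << 16) | rand
    # Fixed-width lowercase hex pairs, so string order matches numeric order.
    return ":".join(f"{b:02x}" for b in value.to_bytes(8, "big"))


def id_to_int(text):
    """Convert the colon-separated form back to its integer value."""
    return int(text.replace(":", ""), 16)


earlier = new_id(1000)
later = new_id(2000)
print(earlier, later, earlier < later)

# Integers above 2**53 do not survive a trip through a double, which is
# how browsers parse JSON numbers; the string form has no such limit.
big = 2**53 + 1
print(int(float(big)) == big)
```

Because the timestamp occupies the most significant bits, IDs created later sort after earlier ones, which is what gives the index locality the post describes.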

open-data

Posts with mentions or reviews of open-data. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-12-25.
  • How to practice data analytics skills
    3 projects | news.ycombinator.com | 25 Dec 2023
  • [OptaJoe]2009 - Arsenal have won a Premier League game they were losing at half-time outside of London for the first time since December 2009 (2-1 at Liverpool). Temperament.
    3 projects | /r/soccer | 18 Feb 2023
    You can check StatsBomb open data, but you will need to preprocess it from JSON to SQL. They have a great course and articles about analyzing the data. Another good read is awesome-football, which provides a list of resources for getting data. But the most comprehensive and recommended resource is eddwebster's guide. He worked for City Football Group, and his repository is updated frequently.
  • Enzo Fernández Progressive Passes - World Cup 2022
    1 project | /r/chelseafc | 1 Feb 2023
    I tried visualising Enzo's progressive passes in each of his world cup matches. I used the data available on StatsBomb for this.
  • Football (soccer) player statistics - looking for free databases
    1 project | /r/datasets | 21 Nov 2022
    https://www.football-data.org/coverage
    https://datahub.io/collections/football
    https://github.com/statsbomb/open-data
    https://www.kaggle.com/datasets/hugomathien/soccer
    https://www.kaggle.com/datasets/martj42/international-football-results-from-1872-to-2017
    https://www.kaggle.com/datasets/secareanualin/football-events
    https://www.kaggle.com/datasets/adityadesai13/european-football-database-20192020
    https://www.kaggle.com/datasets/vivovinco/20212022-football-player-stats
    https://www.kaggle.com/datasets/antoinekrajnc/soccer-players-statistics
  • Ask HN: Who is hiring? (September 2022)
    20 projects | news.ycombinator.com | 1 Sep 2022
    StatsBomb | Multiple roles | REMOTE, or Bath (UK), or Cairo (Egypt)

    StatsBomb is a sports analytics startup, covering football (both the soccer and American varieties) and soon basketball. We sell data products as well as analysis tools to sports, media and gambling organisations, with a tech pipeline that includes computer vision, machine learning, stream processing, and web-based dataviz. We count many of the biggest names in football as customers, and your work will have a direct impact on our ability to deliver insights to those customers, driving success on the field.

    We're hiring software engineers of various stripes (data pipeline roles with Python and Clojure, full-stack web dev roles with JavaScript) and more besides. We're fully remote, but have offices in Bath, UK and Cairo, Egypt for those that want them. We organise regular team days and also run our own industry-leading conference each year.

    - Apply at: https://statsbomb.com/careers

    If you'd like to find out more about football analytics:

    - Play with our open data: https://github.com/statsbomb/open-data

    - Read our articles: https://statsbomb.com/articles/

    - Browse our conference videos: https://www.youtube.com/channel/UCmZ2ArreL9muPvH49Gaw0Bw

  • [OC] Football Wind ⚽️💨 A wind map visualisation of a typical football game. Each particle is following a force field built from the aggregation of 882,536 passes from 890 matches played in various major leagues/cups.
    1 project | /r/dataisbeautiful | 24 Jun 2022
    The data source providing all the passes is StatsBomb.
  • 🏆 TAA vs the u23 world: progressive passes/90 & xA/90
    1 project | /r/FantasyPL | 19 Jan 2022
    If you're familiar with GitHub and JSON then https://github.com/statsbomb/open-data looks decent.
  • Looking for football (soccer) granular datasets
    1 project | /r/datasets | 17 Jan 2022
    The company StatsBomb, which specializes in football analytics, has made a lot of their data available for public use here: https://github.com/statsbomb/open-data I’ve been playing with it recently and I’ve found it to be pretty great.
  • [OC] Lionel Messi's shots and goals with Barcelona during his record-breaking 2011/2012 season, compared to his attempts in the 2014 and 2018 World Cups with Argentina
    2 projects | /r/dataisbeautiful | 21 Dec 2021
    Messi has routinely been one of the best performers in European soccer, including his record-breaking 2011-2012 season in the Spanish league (“La Liga”) with Barcelona, where he set the record for most goals in a season. Unfortunately, success with the Argentina national team has frequently eluded him, finishing as a “runner-up” in the World Cup once and in the Copa America 3 times, before finally winning the Copa America in 2021. Critics often point to his difficulties with his national team as a fatal flaw. I was interested in how his scoring opportunities during arguably his best performance at Barcelona compared to his chances made with Argentina. The data suggests that he is regularly shooting from further away from goal when playing with Argentina when compared to his best performance with Barcelona, which could be a result of a number of factors (different team tactics, difficulty getting up the field, increasing age, less familiarity with teammates, etc.).

    Data: 2011/2012 La Liga season and World Cup 2018 data were collected from the very nice, public datasets provided by StatsBomb at https://github.com/statsbomb/open-data. The World Cup 2014 data was a bit more difficult to find, but was scraped from the Huffington Post. The StatsBomb data has a ton of great stats to dig into, but because the Huffington Post data had less detail, I wasn't able to go into all of it with just this plot.
  • xG stats for individual shots.
    1 project | /r/SoccerBetting | 24 Jul 2021
    I think Statsbomb has a free API you can use on Github if you request access. https://github.com/statsbomb/open-data
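
Several of the posts above mention that the JSON files need preprocessing before they can be queried with SQL. A minimal sketch of that step, using Python's standard `json` and `sqlite3` modules, might look like the following. The sample records and the field names (`id`, `minute`, `type.name`, `player.name`, `location`) are assumptions made for illustration and may not match the real files exactly; `load_events` is a hypothetical helper, not part of any StatsBomb tooling.

```python
import json
import sqlite3

# Inline sample data standing in for one events file (assumed shape).
SAMPLE_EVENTS = json.loads("""
[
  {"id": "a1", "minute": 3, "type": {"name": "Pass"},
   "player": {"name": "Player A"}, "location": [60.0, 40.0]},
  {"id": "a2", "minute": 7, "type": {"name": "Shot"},
   "player": {"name": "Player B"}, "location": [108.5, 36.2]},
  {"id": "a3", "minute": 0, "type": {"name": "Starting XI"}}
]
""")


def load_events(conn, events):
    """Flatten nested event objects into one row per event."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS events (
               id TEXT PRIMARY KEY, minute INTEGER, type TEXT,
               player TEXT, x REAL, y REAL)"""
    )
    rows = []
    for event in events:
        # Some events carry no player or location; store NULLs for those.
        loc = event.get("location") or [None, None]
        rows.append((
            event["id"],
            event.get("minute"),
            event.get("type", {}).get("name"),
            event.get("player", {}).get("name"),
            loc[0],
            loc[1],
        ))
    conn.executemany(
        "INSERT OR REPLACE INTO events VALUES (?, ?, ?, ?, ?, ?)", rows
    )
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
loaded = load_events(conn, SAMPLE_EVENTS)
shots = conn.execute(
    "SELECT player, x, y FROM events WHERE type = 'Shot'"
).fetchall()
print(loaded, shots)
```

To work with real files, the inline sample would be replaced by `json.load` on a downloaded events file, and the column list extended to whatever fields the analysis needs.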

What are some alternatives?

When comparing flow and open-data you can also consider the following projects:

realtime - Broadcast, Presence, and Postgres Changes via WebSockets

opendata - SkillCorner Open Data with 9 matches of broadcast tracking data.

timely-dataflow - A modular implementation of timely dataflow in Rust

geometry-api-java - The Esri Geometry API for Java enables developers to write custom applications for analysis of spatial data. This API is used in the Esri GIS Tools for Hadoop and other 3rd-party data processing solutions.

rethinkdb_rebirth - The open-source database for the realtime web.

sample-data - Metrica Sports sample tracking and event data

pldb - PLDB: a Programming Language Database. A computable encyclopedia about programming languages.

football_analytics - 📊⚽ A collection of football analytics projects, data, and analysis by Edd Webster (@eddwebster), including a curated list of publicly available resources published by the football analytics community.

Hasura - Blazing fast, instant realtime GraphQL APIs on your DB with fine grained access control, also trigger webhooks on database events.

nba-movement-data - SportVU movement tracking data.

github-actions - A GitHub Action for installing and configuring the gcloud CLI.

geomesa - GeoMesa is a suite of tools for working with big geo-spatial data in a distributed fashion.